-
-
- IDMR Working Group Steve Deering
- INTERNET-DRAFT Xerox PARC
- Expires April 1994 Deborah Estrin
- <draft-ietf-idmr-igmp-sparse-00.txt> USC/ISI
- Dino Farinacci
- cisco Systems
- Van Jacobson
- LBL
- October 18, 1993
-
-
- IGMP Router Extensions for Routing to Sparse Multicast-Groups
-
-
- Status of this Memo
-
- This document is an Internet Draft. Internet Drafts are working
- documents of the Internet Engineering Task Force (IETF), its Areas,
- and its Working Groups. Note that other groups may also distribute
- working documents as Internet Drafts.
-
- Internet Drafts are draft documents valid for a maximum of six
- months. Internet Drafts may be updated, replaced, or obsoleted by
- other documents at any time. It is not appropriate to use Internet
- Drafts as reference material or to cite them other than as a "working
- draft" or "work in progress."
-
- Please check the I-D abstract listing contained in each Internet
- Draft directory to learn the current status of this or any other
- Internet Draft.
-
-
- **************************************************************************
- *********************P*L*E*A*S*E*****R*E*A*D******************************
-
- ***October 18th version solves the problem of dead time between when the
- receiver DR sets up S,G and when the associated join is processed all the
- way upstream (the problem was that as soon as S,G is set up then packets
- arriving on *,G would get dropped because the incoming interface does not
- match). The fix is that you associate a bit with the S,G entry and when
- it is cleared, packets that do not match the incoming interface are
- checked against *,G before being dropped. If they match *,G, they are
- forwarded accordingly. Once a router sees a packet that matches S,G (both
- longest match for source and incoming interface) it sets the bit
- associated with S,G and from then on data packets must match the incoming
- interface for S,G or be dropped. A DR waits to send a prune up the
- RP,tree until the SPT bit for the S,G entry is set.
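-
- As a rough illustration only, the check described in this note can be
- sketched in Python as follows. The names Entry and forward and the
- interface labels are hypothetical, and the sketch assumes the longest
- match lookup for the source has already selected the S,G entry.
-
```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Entry:
    """Hypothetical forwarding entry: incoming interface, outgoing list, SPT bit."""
    iif: Optional[str]
    oifs: Set[str] = field(default_factory=set)
    spt_bit: bool = False


def forward(sg: Optional[Entry], star_g: Optional[Entry], arrival_if: str) -> Set[str]:
    """Return the interfaces a data packet is forwarded on (empty set means drop)."""
    if sg is not None:
        if arrival_if == sg.iif:
            # A packet matching S,G on the right interface sets the bit.
            sg.spt_bit = True
            return sg.oifs - {arrival_if}
        if sg.spt_bit:
            # Bit set: packets must match the S,G incoming interface or be dropped.
            return set()
    # No S,G entry, or its bit is still clear: fall back to the *,G check.
    if star_g is not None and arrival_if == star_g.iif:
        return star_g.oifs - {arrival_if}
    return set()
```
-
- In this sketch a packet arriving on the RP-tree interface is still
- delivered while the bit is clear, and is dropped once the bit is set.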
-
- ***DISCLAIMER: In my usual rush to get a revised version of this document
- to the IDMR list, Deering, Farinacci, and Jacobson did not get the chance to
- review the recent detailed changes to the document. I will be out of
- town and largely off-the-net from October 18 until IETF. Please send
- comments to the list anyway and hopefully someone will be able to respond
- in a timely manner. At the very least, we will collect them at IETF and
- respond to them in real time.
-
- ***At the Amsterdam IETF we thought that we could eliminate S,G state on
- the shared tree. After chasing our tails for a while we realized that in
- order to maintain our ability to do an incoming interface check on data
- packets, to detect multicast loops, we needed S,G specific state "offtree" of the
- shared tree (to use CBT terminology). However, to deal with sparse
- traffic and with data packets sent before the S,G state is established we
- support encapsulation of data packets in registers so that it is possible
- that the S,G state might never be set up if it is not deemed necessary
- (data packets are sporadic and few) and that at least data packets do not
- need to be dropped (or buffered) until the S,G state is setup.
-
- ***This brings up an important point, that we believe it critical to
- do incoming interface checks on all multicast data packets because of the potentially
- severe consequences of looping multicast packets. Any multicast protocol
- that we design should have this capability.
-
- ***The biggest open issue with respect to this scheme is the need to
- introduce an aggregation mechanism for S,G state and messages. Van has
- proposed something called proxy. It looks doubtful that something will
- get written up by IETF, but it will be the primary agenda item
- immediately after.
-
- ***The major changes to this document since the last release are that we
- 1. added back the S,G state upstream of the RP on the shared tree.
- 2. added a mechanism to disambiguate which RP is being used in the
- multiple RP case (see additional check added to ESL-Join processing
- and additional check added to RP-reachability message processing).
- 3. Data packets travel encapsulated in register packets until the state
- is established for S,G (e.g. the RPs' join propagates upstream to the
- first hop router)
- 4. Register packets are sent unicast to the RP who sets up S,G and sends
- joins upstream towards S to set up S,G state between the source and RP.
- 5. All *,G entries have as their incoming interface the interface used to
- reach the RP.
- 6. Added option for Register and RP-reachability messages to carry
- source, mask information downstream so that S,G entries can be set up
- with appropriate (subnet) mask information.
-
-
- ***Note: The name ESL is not perfect; at least I, DE do not think so.
- This is particularly true in light of the dense mode scheme which will
- really be part of the same protocol. But to avoid changing the name
- repeatedly we are sticking with ESL and down the road when the
- protocols themselves stabilize, we will reopen the question of names...
- **************************************************************************
-
- 1. Introduction and Motivation
-
- This document describes a mechanism for efficiently routing to sparse
- multicast groups that span wide-area (and inter-domain) internets. The
- implementation of our approach is based on extensions to IGMP[RFC1112]
- and makes use of explicit source lists. We refer to the scheme as ESL.
-
- The mechanism proposed here complements existing multicast routing
- mechanisms such as those implemented in MOSPF and DVMRP. These traditional
- multicast schemes are well suited for use within local or wide area regions
- where a group is widely represented. However, when group members, and
- senders to those group members, are distributed SPARSELY across a wide
- area, these schemes are not efficient; data packets (in the case of DVMRP)
- or membership report information (in the case of MOSPF) are sent over many
- links that do NOT lead to receivers or senders, respectively. The Explicit
- Source List (ESL) extensions to IGMP, proposed here, efficiently establish
- multicast distribution trees to reach members of a multicast group in
- regions where members are NOT densely represented.
-
- Stated simply, when group members densely populate the internet, it is
- efficient to assume that most networks or subnetworks contain members,
- and to prune off the exception networks explicitly. In contrast, when
- group members are sparse, it is efficient to assume that most networks or
- subnetworks do NOT contain members, and to join on the exception networks
- explicitly. Thus the basis of the scheme described in this document is an
- explicit joining mechanism. In another document, in preparation, we will
- describe a companion multicast routing protocol for dense groups that is
- based on implicit joining and explicit pruning.
-
- The Core Based Tree (CBT) protocol supports sparse multicast groups
- with a shared (center-based) tree.[CBT] In contrast, the ESL protocol
- proposed here supports both center and source-specific distribution trees
- in order to provide higher quality data distribution, when needed.
-
- In the remainder of this section we enumerate our design goals and then
- present necessary background on existing Reverse Path Multicasting (RPM)
- mechanisms. Section 2 summarizes basic protocol operation and limitations.
- Section 3 describes the protocol in detail (including packet formats).
- Section 4 addresses robustness, and Sections 5 and 6 address interoperation
- with networks that do not implement ESL-multicast. Section 7 briefly
- compares our approach to CBT[cbt93] and addresses several open design
- issues.
-
- 1.1 Design Goals
-
- We had several design objectives in mind when designing this protocol:
-
- 1.1.1 Efficient Sparse Group Support
-
- Our primary design goal is efficient support for sparsely distributed
- wide-area multicast groups ("skinny trees"). We define a sparse group
- as one in which a) the number of networks/domains with members is
- significantly smaller than the number of networks/domains in the Internet,
- so that traditional RPM or Link-State style multicast (e.g., MOSPF) is
- inefficient; b) group members span an area that is too large/wide
- to rely on scope control; and c) the internetwork spanned by the group is
- not sufficiently resource rich to ignore the overhead of current schemes.
- Sparse groups are not necessarily "small"; therefore we must
- support dynamic groups with large numbers of receivers.
-
-
- 1.1.2 High Quality Data Distribution
-
- We wish to support low-delay data distribution when needed by the
- application. In particular, we avoid IMPOSING a single shared tree in which
- data packets are forwarded to receivers along a common tree, independent of
- their source. Source-specific trees are superior when (a) multiple sources
- send data simultaneously and would experience poor service when the traffic
- is all concentrated on a single shared tree, or (b) the path lengths
- between sources and destinations in the shortest path trees (SPTs) are
- significantly shorter than in the shared tree. In particular, to support
- low-delay distribution for "continuous" media such as voice and video, it
- is desirable (and perhaps essential) to avoid concentrating the traffic of
- all sources onto a common distribution tree; it is preferable to spread the
- traffic load and have the data travel over source-specific trees. Moreover,
- for all types of traffic, if one wants to minimize path length, data
- packets should travel along a shortest path distribution tree rooted in the
- source; i.e., data packets are forwarded based on the {Multicast-Address,
- Source} tuple (as in MOSPF).
-
- For some applications or contexts group trees are appropriate; for example,
- resource-discovery applications where large numbers of sources transmit
- packets intermittently. However, shared trees should not be imposed as the
- only delivery option. In the scheme presented here a shared tree is
- maintained as a rendezvous mechanism for new receivers and sources;
- however, in steady state, data can be delivered over a shortest path from
- receiver to sender, or over the shared tree. We use the term rendezvous
- points (RPs) in this document because although similar, the role and
- function of an RP is different from a Core router in the CBT scheme.
-
- The protocol described here is based on Reverse Path Forwarding. All RPF
- schemes construct "reverse shortest path" trees. When unicast routes are
- symmetric (i.e., the shortest path from a source to receiver is the same
- as the shortest path from the receiver to the source), reverse shortest
- path trees will provide equivalent paths to forward shortest path trees.
- However, when unicast routes are not symmetric, the reverse shortest path
- may be longer than the forward shortest path. Nevertheless, the reverse
- shortest path tree delays will still be superior, in general, to the use
- of a shared tree.
-
-
- 1.1.3 Routing Protocol Independent
-
- The protocol should rely on existing unicast routing functionality to
- adapt to topology changes, but at the same time be independent of the
- particular protocol employed. This is achievable if the multicast
- protocol makes use of the unicast routing tables, independent of how
- those tables are computed.
-
- 1.1.4 Interoperability
-
- We need interoperability with traditional RPM and link-state multicast
- routing, both intra- and inter-domain. For example, within a single
- conversation it should be possible for intra-domain distribution to be
- established by IP-style multicast (e.g. MOSPF) and inter-domain
- distribution by ESL; and for some inter-domain distribution to be
- established by ESL and some by another inter-domain multicast approach
- designed for more-densely distributed groups. (This sidesteps the question
- of when to use which type of multicast routing approach.) However, to
- interoperate with some
- existing IGPs it will be necessary to impose some additional protocol or
- configuration overhead.
-
- In support of interoperation with IP multicast, AND in support of groups
- with very large numbers of receivers, we also wish to maintain the logical
- separation of roles between receivers and senders. We may introduce some
- optimization to support senders that are also receivers (this is often
- appropriate to the application), but we do not want to impose the dual
- role/symmetry.
-
-
- 1.2. Reverse Path Forwarding
-
- Before proceeding with a description of ESL we briefly describe existing
- RPM mechanisms. This provides more detailed motivation for the extensions
- proposed here, and provides background as to the existing mechanisms with
- which ESL must interoperate.
-
-
- 1.2.1 RPM mechanisms
-
- Reverse Path Forwarding (RPF) is a technique used to forward multicast
- datagrams. A router will forward a multicast datagram, from a given source,
- if it was received on the same interface that it uses to forward unicast
- datagrams to that source. Hence, the reverse path is interrogated before
- forwarding is performed. RPF forwards the data packet out all interfaces
- except the incoming. Reverse Path Broadcasting (RPB) eliminates some of the
- duplicate packets generated by RPF by only sending packets out on a subset
- of the outgoing interfaces; this subset is based on a local database of
- parent-child information obtained from the unicast routing protocol.
- Truncated RPF or RPB detects leaf networks without members and stops
- forwarding multicast packets for that group. Reverse Path Multicasting
- (RPM) has mechanisms for detecting and distributing prune information
- upstream of the truncated leaf networks to stop distribution of multicast
- packets to parts of the network that do not have members. RPM schemes
- periodically flush this prune information and revert back to RPF or RPB
- behavior. In addition, explicit graft messages may be used to undo pruning
- and thereby join new members into the distribution tree more quickly.
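-
- The basic RPF check above can be sketched as follows. This is a minimal
- Python illustration under stated assumptions: the unicast table is a
- hypothetical dictionary from prefix to interface name, and the function
- names rpf_interface and rpf_forward are ours, not part of any protocol.
- The RPB, truncation, and pruning refinements are not modeled.
-
```python
import ipaddress
from typing import Dict, List, Optional


def rpf_interface(routes: Dict[str, str], source: str) -> Optional[str]:
    """Longest-prefix match of `source` in a toy unicast table {prefix: interface}."""
    addr = ipaddress.ip_address(source)
    best = None
    for prefix, iface in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, iface)
    return None if best is None else best[1]


def rpf_forward(routes: Dict[str, str], interfaces: List[str],
                source: str, arrival_if: str) -> List[str]:
    """Forward out all interfaces except the incoming one, if the RPF check passes."""
    if rpf_interface(routes, source) != arrival_if:
        return []  # arrived on the wrong interface: do not forward
    return [i for i in interfaces if i != arrival_if]
```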
-
- 1.2.2 Router to Host interaction mechanisms
-
- Hosts inform routers which multicast groups they are members of. Routers
- use this information to tell other routers that group members are (MOSPF),
- or are not (DVMRP), present. IGMP is the protocol mechanism for Router to
- Host interaction for IP multicasting. In particular, Routers send IGMP Host
- Query packets periodically to ask hosts for group membership information.
- Hosts reply with IGMP Host Report packets for each multicast group they are
- a member of; however when a host hears a membership join report from
- another host on the same LAN, it will suppress its join to avoid duplicates
- [IGMP]. Hosts do not have to inform routers explicitly when they want to
- send a multicast datagram.
-
- ESL uses this same mechanism to identify the location of group members.
- ESL will also use a router-to-router IGMP Query packet so adjacent
- multicast routers can be identified. See section 5 for details.
-
-
- 1.2.3 Limitations of Existing Multicast Routing Mechanisms
-
- Existing multicast routing mechanisms work efficiently when most networks
- have local members and therefore the default RPM treatment of data packets
- (as in DVMRP), or flooding of link state advertisements with local
- membership information (as in MOSPF) is appropriate. In this context the
- lack of local members is best treated as an exception by issuing a "Prune"
- message. However when most networks do NOT have local members there is
- significant overhead associated with these schemes. One of the major
- incentives to use multicast is efficient bandwidth usage (otherwise
- multicast routing support would not be needed to begin with). This is most
- critical in wide area and multi-domain internetworks where resources are
- not as uniformly available as in the local and campus network context.
-
- 1.2.4 Scope Control
-
- One way of limiting the overhead of multicast is to define a maximum
- number of hops that all messages will traverse. In this way the multicast
- group information will only be distributed within a limited region.
- This is a perfect mechanism for local groups. However, for groups that
- span wider areas, the scope would have to be set so high that the
- reverse path multicasting of data packets, or the flooding of membership
- information, would once again consume excessive resources.
-
-
- 1.3 Directly connected ESL routers
-
- A directly connected router is one that either a) shares a physical network
- with a receiver host (i.e., both are physically connected to a common
- multi-access network), or b) is in the same domain as a receiver and some
- other multicast routing protocol within the domain is used to distribute
- multicast packets and to signal membership. A Directly connected ESL router
- is the last hop ESL router, from the perspective of a receiver; it is the
- first hop ESL router from the perspective of a source.
-
-
-
- 2. Overview of the Explicit Source List (ESL) protocol design
-
-
- In the remainder of this document we describe a multicast
- routing mechanism for sparse groups which can be realized with
- relatively simple extensions to IGMP[RFC1112].
-
- 2.1 ESL
-
- We introduce two new IGMP message types to establish distribution trees
- between sources and receivers (group members); routers send ESL-Join
- messages upstream towards sources and routers send ESL-Register messages
- downstream from sources to the RPs. An ESL-Join message contains both a
- join and prune list; the former enumerates the sources from which the
- downstream receivers wish to receive packets (via this router, i.e., this
- upstream router) and the latter indicates the sources from which the
- downstream receivers do not expect to receive packets (via this route). An
- ESL-Register message identifies the group and is sent directly to each of
- the RPs associated with the group. The ESL messages are sent to unicast
- addresses using raw IP.
-
- We can summarize the operation of the ESL scheme as follows. One or
- more Rendezvous Points (RPs) are used INITIALLY to propagate data
- packets from sources to receivers. An RP may be an ESL-speaking router
- that is close to one of the members of the group, or it may be some
- other host or router in the network. A sparse mode group, i.e., one
- that the receiver's directly connected ESL router will join using ESL,
- is identified by the presence of RP address(es) associated with the
- group in question. The mapping information may be configured or may be
- learned through another protocol mechanism.
-
- When sources start sending to a multicast group, the first hop
- ESL-router sends an ESL-Register message to the RP(s) for that group.
- When a receiver joins an ESL multicast group, its first hop ESL router
- sends an ESL-Join message towards one of the RPs. If source-specific
- distribution trees are desired, the first hop ESL router for each
- member (receiver) eventually joins the source-rooted distribution tree
- for each source by sending an ESL-Join message towards the source and
- after data packets are received on the new path, it sends
- an IGMP prune message toward the RP (assuming these represent
- different uplinks/branches). The state maintained in routers is the
- same as the forwarding information that is currently maintained by
- routers running existing IP multicast protocols such as MOSPF, i.e.,
- source (S), multicast address (G), outgoing interface (oif), incoming
- interface (iif). We refer to this forwarding information as the
- multicast forwarding entry for (S,G). The ESL messages sent upstream
- by receivers include an explicit list of the sources known to the
- downstream receivers (thus the name).
-
-
- 2.2 Design Tradeoffs
-
- Referring back to our design objectives, we selected the ESL approach over
- an alternative, source-initiated approach, where each receiver contacts
- each source and each source sends out a message to the routers to install a
- distribution tree from that source to all the listed receivers[ST-II]. The
- source-initiated approach is less desirable because it a) requires each
- source to deal with join requests from each receiver (or each receiver
- aggregate such as a domain), and b) requires that member-domains be listed
- explicitly. The last concern is the most significant because it means that
- the source-initiated scheme's overhead increases with the number of
- receivers (receiver domains) in the group.
-
- One limitation of the ESL approach, mentioned earlier, occurs when there
- are asymmetric paths. This occurs when the unicast path from a given
- receiver is different than the multicast path from the source to that
- receiver. Since RPM is used, the path chosen is the one from receiver to
- source. It is our opinion that this route asymmetry problem is NOT
- critical. In the future, if routing protocols become more load-sensitive,
- and as a result more routes are asymmetric due to asymmetric traffic
- loading, we may need to rely on other aspects of the adaptive routing
- service to address this problem. (For example, if unicast routing
- could provide a special QoS route whose characteristic was that it
- represented the preferred path FROM the indicated destination, instead of
- TO the destination, then the ESL messages used in our protocol could be
- sent using that QoS, and the deficiency described here would be avoided.)
-
- ESL avoids explicit enumeration of receivers, but does require enumeration
- of sources. If there are very large numbers of sources sending to a group
- but the sources' average data rates are low, then the group can be
- supported with a shared tree instead which has less per-source overhead. If
- shortest path trees are used then when the number of sources grows very
- large, some form of aggregation or proxy mechanism will be needed; see
- section 6. We selected this tradeoff because in many existing and
- anticipated applications, the number of receivers is much larger than the
- number of sources. And when the number of sources is very large, the
- average data rate tends to be very low (e.g. resource discovery).
-
-
- 3. Protocol Description
-
- Below is a description of the protocol steps and messages.
-
- 3.1 Overview
-
- ESL-Join messages traveling up from receivers to the RP create a
- RP-rooted distribution tree that is used to distribute data packets from
- new sources to all receivers and from all sources to new receivers.
- ESL-Register messages traveling from sources to the RP cause the RP to
- send join messages upstream to the sources and thereby create
- distribution paths from the sources to the RP-rooted distribution tree.
- In this way, a shared tree is formed between the sources and the RP
- and the RP and receivers.
-
- If shortest path, source-specific, trees are to be used, then data packets
- from new sources will trigger ESL-Join messages to travel up from receivers
- via their shortest paths to sources.
-
- | |
- |** MR-1 ************ MR-2 ******** MR-3 --|
- | . . *@ |-- Receiver-1-Ga
- Source-1 -| . . *@ |
- | . . *@ |
- | . . *@
- | RP ................. MR-8
- . . *@
- . . *@
- . . *@
- | . . *@ |
- |@@ MR-4 @@@@@@@@@@@@ MR-5 @@@@@@@@ MR-6 --|
- | \ / |-- Receiver-2-Ga
- Source-2 -| \ / |
- | \ / |
- | --------- MR-7 --------
-
-
- ... RP-rooted distribution tree
- *** Source-1 based distribution tree
- @@@ Source-2 based distribution tree
- MR Multicast Router
-
- 3.2 Receiver/Upstream messages
-
- This section describes the sequence of messages sent as receivers
- join a group, as well as the actions taken to establish distribution
- paths to the receivers.
-
- 1) Host sends IGMP-Report message identifying a particular group, G,
- in response to a directly-connected Router's IGMP-Query message. From
- this point on we refer to such a host as a receiver, R, (or member) of
- the group G.
-
- 2) When a designated router (DR) receives a report for a new group
- G it checks to see if it has RP address(es) associated with G. The
- mechanism for learning this mapping of G to RP(s) is
- somewhat orthogonal to the specification of this protocol;
- however, we require some mechanism in order for the protocol to work.
- At the very least this information must be manually configurable. In
- addition, as discussed in Section 7, we propose the use of a new
- IGMP-RP-report message that would allow hosts to inform their
- directly-connected ESL routers of G,RP(s) mappings. This is important
- for dynamic groups where hosts participate in special applications
- to advertise and learn of multicast addresses and their associated
- RP(s).
-
- A DR will identify a new group (i.e., one for which it has
- no existing multicast entries) as needing ESL support by checking if
- there exists an RP mapping. If there is no RP mapping provided in IGMP
- report messages, and there is no mapping provided in the appropriate
- configuration file, then the router will assume that the group is NOT
- to be supported with ESL. Even when a group has an associated RP, it
- may be that some outgoing and incoming interfaces do not require ESL,
- but are handled using a dense mode scheme such as MOSPF, DVMRP, or
- Dense mode ESL. In this case the router will flag individual
- interfaces as dense or sparse mode, to allow differential treatment of
- different interfaces. For the sake of clarity, we will ignore these
- added complexities throughout most of the protocol description.
-
- For the remainder of this description we will also assume a single RP just
- for the sake of clarity. We describe the direct extensibility to operation
- with multiple RPs later in the document.
-
- 3) The DR creates a multicast forwarding cache entry for (*,G). The RP
- address is included in a special record in the forwarding entry, so
- that it will be included in upstream join messages. The outgoing
- interface is set to that over which the IGMP report was received from
- the new member. The incoming interface is set to the interface used
- to send unicast packets to the RP. A wildcard (WC) bit is
- associated with this entry.
-
- The DR sets an RP-timer for this entry. The timer is reset each time an
- RP-reachable message is received for *,G (see 3.3).
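-
- Steps 2 and 3 can be sketched as follows. This is a hedged Python
- illustration, not a specification: StarGEntry, on_igmp_report,
- on_rp_reachable, the dictionary-based tables, and the RP_TIMEOUT value
- are all assumptions made for the example.
-
```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

RP_TIMEOUT = 90.0  # seconds; arbitrary placeholder, not taken from this document


@dataclass
class StarGEntry:
    """Hypothetical (*,G) forwarding entry kept by a DR."""
    group: str
    rp: str                      # RP address recorded for upstream joins
    iif: Optional[str]           # interface used to send unicast packets to the RP
    oifs: Set[str] = field(default_factory=set)
    wc_bit: bool = True
    rp_timer_expiry: float = 0.0


def on_igmp_report(table: Dict[str, StarGEntry], rp_map: Dict[str, str],
                   unicast_if: Dict[str, str], group: str,
                   report_if: str, now: float) -> Optional[StarGEntry]:
    """Create or update the (*,G) entry when an IGMP report arrives on report_if."""
    rp = rp_map.get(group)
    if rp is None:
        return None  # no RP mapping: the group is not supported with ESL
    entry = table.get(group)
    if entry is None:
        entry = StarGEntry(group=group, rp=rp, iif=unicast_if[rp],
                           rp_timer_expiry=now + RP_TIMEOUT)
        table[group] = entry
    entry.oifs.add(report_if)
    return entry


def on_rp_reachable(entry: StarGEntry, now: float) -> None:
    """Reset the RP-timer each time an RP-reachable message is received."""
    entry.rp_timer_expiry = now + RP_TIMEOUT
```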
-
-
- 4) The router creates an ESL-Join message with the RP address in its
- join list with the WC bit set; nothing is listed in its
- prune list. The WC bit indicates that the receiver expects to receive
- packets from new sources via this path and therefore upstream routers
- should create or add to *,G forwarding entries. The WC bit also indicates
- that the particular IP address is being used as an RP and that the
- router with that address should send an RP-reachability message
- downstream; these messages are effectively sent periodically in
- response to the receipt of periodic join messages. The message is sent
- as an IP packet addressed to the next hop router upstream towards the
- RP; the payload contains the IGMP information Multicast-Address=G,
- ESL-join={WCbit}, ESL-prune=NULL.
-
- 5) Each upstream router creates or updates its multicast forwarding
- entry for (*,G) when it receives an ESL-Join with the WC bit set. The
- interface on which the ESL-Join message arrived is added to the list
- of outgoing interfaces for (*,G). As a result each upstream router
- between the receiver and the RP sends an ESL-Join message in which the
- join list includes the RP and the WC bit. The
- messages are sent using IP addressed to the next hop router used to
- reach the RP. The payload IGMP packet contains Multicast-Address=G,
- ESL-join={WCbit}, ESL-prune=NULL.
-
- The RP recognizes its own address and does not attempt to send join
- messages for this entry upstream. Because the RP recognizes itself as the
- RP it knows to send RP-reachability messages in response to the periodic
- join messages received from downstream. In addition, the incoming
- interface in the RP's *,G entry is set to null.
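-
- The upstream processing in step 5 might look like the sketch below. It
- is illustrative only: process_wc_join, the dictionary layout of the
- entry and of the returned message, and the next_hop_if table are
- hypothetical. Timers and RP-reachability messages are not modeled.
-
```python
from typing import Dict, Optional


def process_wc_join(table: Dict[str, dict], my_addr: str,
                    next_hop_if: Dict[str, str], group: str,
                    rp: str, arrival_if: str) -> Optional[dict]:
    """Update (*,G) state for an ESL-Join with the WC bit set.

    Returns the join to send upstream, or None when this router is the RP.
    """
    entry = table.get(group)
    if entry is None:
        # The incoming interface is null at the RP; elsewhere it is the
        # interface used to reach the RP.
        iif = None if rp == my_addr else next_hop_if[rp]
        entry = {"rp": rp, "iif": iif, "oifs": set()}
        table[group] = entry
    # The interface the join arrived on becomes an outgoing interface.
    entry["oifs"].add(arrival_if)
    if rp == my_addr:
        return None  # the RP does not send joins upstream for this entry
    return {"group": group, "join": [(rp, "WC")], "prune": [],
            "send_on": entry["iif"]}
```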
-
- 6) When an ESL-router has directly-connected members that want to join the
- group with shortest paths, the router notices data
- packets for G that are NOT sourced by an address for which it has a
- multicast forwarding entry. The router initiates a new multicast
- forwarding entry for (Sn,G), clears the "SPT-bit" for that entry, and sets
- a timer for the S,G entry.
-
- The router also triggers the generation of IGMP messages upstream. For
- example, an ESL-Join message will be sent upstream to the best next
- hop towards the new source, Sn, with Sn in the join list:
- Multicast-Address=G, ESL-join={Sn}, ESL-prune=NULL. The ESL-Join
- message that gets sent upstream toward the RP will have Sn in the
- prune list (at the point where the two upstream paths diverge) when the
- SPT bit on the DR's S,G entry is set:
- Multicast-Address=G, ESL-join={RP,*}, ESL-prune={Sn}.
-
- In order to
- avoid missing data packets the DR should send the ESL message
- toward the new Sn before sending the prune message toward the RP. The
- DR knows it is time to send the prune when it starts receiving
- new packets from Sn on the interface used to reach Sn. Therefore Sn is
- not included in the prune list sent toward the RP until the SPT bit is
- set for the S,G entry.
-
- When the Sn,G entry is created, the outgoing interface list is copied
- from *,G. In this way when a data packet from Sn arrives and matches on
- this entry, all receivers will continue to receive the source's packets along
- this path unless and until the receivers choose to prune themselves.
-
- Note that a DR may adopt a policy of not setting up a S,G entry
- (and therefore not sending an ESL-Join message toward the source)
- until it has received m data packets from the source within some
- interval of n seconds. This would eliminate the overhead of S,G state
- upstream when small numbers of packets are sent sporadically. However,
- data packets distributed in this manner may be delivered over the
- suboptimal paths of the shared RP tree.
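-
- One possible way to implement the "m data packets within n seconds"
- policy is a sliding window per source, sketched below in Python. The
- class name SptThreshold and its interface are hypothetical; the document
- does not prescribe how the policy is realized.
-
```python
from collections import deque


class SptThreshold:
    """Sliding-window counter: True once m packets were seen within n seconds."""

    def __init__(self, m: int, n: float):
        self.m = m
        self.n = n
        self.arrivals = deque()  # arrival times of recent packets from one source

    def packet_seen(self, now: float) -> bool:
        """Record a packet arrival; return True if S,G state should be set up."""
        self.arrivals.append(now)
        # Discard arrivals that fell out of the n-second window.
        while self.arrivals and now - self.arrivals[0] > self.n:
            self.arrivals.popleft()
        return len(self.arrivals) >= self.m
```
-
- A DR would keep one such counter per source and create the S,G entry
- only when packet_seen returns True.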
-
- The DR may also choose to remain on the RP-distribution tree
- indefinitely instead of moving to the shortest path tree.
-
- 7) In the steady state each router sends periodic refreshes of ESL messages
- upstream to each of the next hop routers that is en route to each source,
- S, for which it has a multicast forwarding entry (S,G); as well as for
- the RP listed in the (*,G) entry. These messages are
- sent periodically to capture state, topology, and membership changes. An
- ESL message is also sent on an event-triggered basis each time a new
- forwarding entry is established for some new (Sn,G) (note that some damping
- function may be applied, e.g., a merge time). Optionally the ESL message
- could contain only the incremental information about the new source and
- only be sent to the next hop toward that source. ESL messages are not
- sent reliably; lost packets will be recovered from at the next periodic
- refresh time.
-
- The join list in an ESL-Join message sent to a neighboring router, X,
- includes an address for each source, S, for which:
- 1) there is a multicast forwarding entry (S,G), or S is listed as the
- RP-entry for (*,G); AND,
-
- 2) X is the next-hop router used to send unicast packets to S
- (or if S is a directly connected host, then include S if X is
- the DR for S's LAN), AND,
-
- 3) the outgoing interface list in the forwarding entry is NOT null.
-
-
- The prune list in an ESL-Join message sent to a neighboring router Y,
- includes an address for each source, S, for which:
- 1) there is an S,G multicast forwarding entry with the SPT bit set and a
- null outgoing interface list, and Y is the next hop to reach S (or, if
- S is a directly connected host, then include S if Y is the DR for S's
- LAN).
-
- In addition, if Y is the next hop used to reach the RP, the prune list also
- includes an address for each source S for which:
- 1) there is a S,G multicast forwarding entry, the SPT bit is set, and
- Y is not the next hop used to reach S.
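-
- The join and prune list rules above can be summarized in a short Python
- sketch. It is illustrative only: the FwdEntry class, the next_hop
- callable and the neighbor names are assumptions of this example, and the
- directly-connected-host case is assumed to be folded into next_hop
- (i.e., next_hop(S) returns the DR for S's LAN).
-
```python
from dataclasses import dataclass, field

@dataclass
class FwdEntry:
    source: str                  # S, or the RP address for the (*,G) entry
    wildcard: bool = False       # True for the (*,G) entry
    oifs: set = field(default_factory=set)
    spt_bit: bool = False

def join_list(entries, neighbor, next_hop):
    """Addresses listed as joins in a message sent to `neighbor`."""
    return [e.source for e in entries
            if next_hop(e.source) == neighbor and e.oifs]

def prune_list(entries, neighbor, next_hop, rp):
    """Addresses listed as prunes in a message sent to `neighbor`."""
    prunes = []
    for e in entries:
        if e.wildcard or not e.spt_bit:
            continue
        toward_source = next_hop(e.source) == neighbor
        if toward_source and not e.oifs:
            prunes.append(e.source)      # no downstream receivers remain
        elif next_hop(rp) == neighbor and not toward_source:
            prunes.append(e.source)      # S arrives via the SPT, not the RP tree
    return prunes
```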
-
- ESL-Join messages are sent periodically and the join and
- prune lists are populated as specified above. In addition four
- events will trigger ESL-Join messages:
- 1) receipt of an IGMP report message for a new group (i.e., one for which
- the receiving router does not have any S,G or *,G entries) will trigger an
- ESL-Join message toward the RP with the RP address and WC bit set in the
- join list, and
-
- 2) receipt of an ESL-Join message for an S,G pair (including *,G) for
- which there is no current forwarding entry will trigger an ESL-Join
- message toward S (or RP) with S (or RP with WC bit set) in the join list.
-
- 3) receipt of a packet on the NEW S,G entry over the appropriate incoming
- interface triggers a) setting of the SPT bit, and b) sending a prune
- message up the RP tree.
-
- 4) when the outgoing interface list becomes null, indicating no more
- downstream receivers, a prune is sent upstream. We do not trigger prunes
- based on data packets. Data packets that arrive on the wrong incoming
- interface are silently dropped.
-
- Note that each source address listed in an ESL may be a specific IP
- address, or may indicate a subnet or a general aggregate. To support
- this generality in the future each ESL entry is represented by a {mask length,
- Address} pair. The distribution of mask information is described in
- Section 3.3 where reachability messages are described.
- The potential for using proxy or aggregate information is described
- briefly in Section 7.
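-
- As an illustration of the {mask length, Address} representation, the
- following Python sketch tests whether a given source falls under an ESL
- entry. The function name is hypothetical; only the standard ipaddress
- module is used.
-
```python
import ipaddress

def esl_entry_matches(mask_length, address, source):
    """True if `source` falls inside the {mask length, address} entry."""
    net = ipaddress.ip_network(f"{address}/{mask_length}", strict=False)
    return ipaddress.ip_address(source) in net
```
-
- With this helper, the entry {28, 192.1.1.16} covers host 192.1.1.17,
- while {32, 192.1.1.17} covers only that single host.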
-
-
- 8) Each router that receives an ESL message processes it as follows:
-
- a) notes the interface on which the ESL-Join message arrived, call it I.
-
- b) if one of the Si has the WC bit set, and a *,G forwarding entry
- already exists, add I to the *,G forwarding entry and set the timer.
- If I is a new interface in the *,G forwarding entry add I to all other
- existing Si,G forwarding entries also, with the exception of those Si
- listed in the prune list. If the value of Si with the WC bit set is
- different from the RP-entry listed in the existing *,G forwarding entry
- then:
-
- i. if Si is greater than the listed RP-entry value,
- set RP-entry to Si,
-
- ii. if Si is less than the listed RP-entry value,
- leave the RP-entry as is. Do not reset the RP-entry
- timer. (These steps are taken so that in the case of
- multiple RPs, loops can be avoided in the RP-based
- shared tree. This is achieved by making sure that
- within any branch of the shared tree, routers will
- converge on using a single RP until it fails.)
-
- The incoming interface is set to the RPF interface to the RP in the *,G
- forwarding entries.
-
- c) for any Si without the WC bit set that is included in the ESL-join
- list, for which there is NO existing (Si,G) forwarding entry, the router
- initiates one. The outgoing interface is set to I, and the incoming
- interface is set to the interface used to send unicast packets to Si. IF
- the interface used to reach Si is the same as the outgoing interface being
- built (i.e., the interface on which the ESL-Join message arrived) this
- represents an error and the join should not be processed.
-
- d) for any Si, included in the ESL-join list, for which there IS an
- existing (Si,G) forwarding entry, the router adds I to the
- list of outgoing interfaces, IF I is not the same as the
- existing incoming interface; If I is the same as the existing
- incoming interface, the existing incoming
- interface takes precedence and the join is dropped.
-
- e) for each Si, included in the ESL-prune list, for which
- there is an existing (Si,G) forwarding entry, the router
- deletes I from the list of outgoing
- interfaces. If the router has a current *,G forwarding entry,
- and if an Si,G entry also exists then the
- forwarding entry is maintained for (Si,G) even if its outgoing
- interface list is NULL. If there is no (Si,G) entry, then one
- is created with the outgoing interface list copied from *,G, and
- the interface on which the prune was received is deleted. This
- acts as a negative cache so that packets from Si are
- not forwarded to the pruning receiver.
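-
- Steps b) through e) above can be sketched in Python as follows. This is
- a minimal illustration under stated assumptions, not a definitive
- implementation: the table is a dictionary keyed by source address (or
- "*" for the *,G entry), creation of a missing *,G entry is not shown,
- "greater" is taken to mean numeric comparison of addresses, and the
- incoming interface chosen for a negative-cache entry (the RPF interface
- toward Si) is an assumption, since the text does not specify it.
-
```python
import ipaddress
from dataclasses import dataclass, field

@dataclass
class Entry:
    iif: str                              # expected incoming interface
    oifs: set = field(default_factory=set)
    rp: str = ""                          # RP-entry, used by (*,G) only

def process_esl(table, iface, joins, prunes, rpf_iface):
    """Apply one ESL-Join message received on interface `iface`.

    joins: list of (address, wc_bit); prunes: list of addresses;
    rpf_iface: callable giving the unicast interface toward an address.
    """
    ip = ipaddress.ip_address
    for addr, wc in joins:
        if wc:                                        # step b
            star = table.get("*")
            if star is None:
                continue                              # (*,G) creation not shown
            is_new = iface not in star.oifs
            star.oifs.add(iface)
            if is_new:
                for src, e in table.items():
                    if src != "*" and src not in prunes:
                        e.oifs.add(iface)
            if ip(addr) > ip(star.rp):
                star.rp = addr                        # the larger RP dominates
            star.iif = rpf_iface(star.rp)
        elif addr not in table:                       # step c
            if rpf_iface(addr) != iface:              # otherwise an error: skip
                table[addr] = Entry(iif=rpf_iface(addr), oifs={iface})
        elif table[addr].iif != iface:                # step d
            table[addr].oifs.add(iface)
    for addr in prunes:                               # step e
        if addr not in table and "*" in table:
            # Negative cache; the choice of iif here is an assumption.
            table[addr] = Entry(iif=rpf_iface(addr), oifs=set(table["*"].oifs))
        if addr in table:
            table[addr].oifs.discard(iface)
```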
-
- 9) A timer is maintained for each outgoing interface listed in each S,G
- or *,G entry. The timer is set when the interface is added.
- The timer is reset each time an ESL-join message is received on that
- interface for that forwarding entry (i.e., S,G or *,G).
-
- When a timer expires, the corresponding outgoing interface is deleted
- from the outgoing interface list. When the outgoing interface list is
- null a prune message is sent upstream and the entry is deleted after 3
- times the refresh period (i.e., 180 seconds).
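-
- The timer handling in step 9 might look like the following Python
- sketch. The class and method names are hypothetical, the clock is
- passed in explicitly so the logic can be exercised without waiting, and
- the per-interface hold time of 3 times the refresh period is an
- assumption (the text above gives 180 seconds only for entry deletion).
-
```python
REFRESH_PERIOD = 60                # seconds
HOLD_TIME = 3 * REFRESH_PERIOD     # 180 seconds

class OifTimers:
    """Tracks when each outgoing interface of one entry was last refreshed."""

    def __init__(self):
        self.last_join = {}        # interface -> time of last ESL-Join
        self.empty_since = None    # time at which the oif list became null

    def refresh(self, iface, now):
        """Called when an interface is added or an ESL-Join arrives on it."""
        self.last_join[iface] = now
        self.empty_since = None

    def expire(self, now):
        """Drop stale interfaces; return (send_prune, delete_entry)."""
        was_empty = not self.last_join
        for iface, t in list(self.last_join.items()):
            if now - t >= HOLD_TIME:
                del self.last_join[iface]
        send_prune = not self.last_join and not was_empty
        if send_prune:
            self.empty_since = now
        delete_entry = (self.empty_since is not None
                        and now - self.empty_since >= HOLD_TIME)
        return send_prune, delete_entry
```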
-
-
- 3.3 Source/Downstream messages
-
- Two types of messages are sent downstream: Registers and RP-reachability
- messages.
-
-
- 3.3.1 Register messages
-
- 1) When a source, S, wishes to send to a multicast group, G, for the
- first time, S simply sends a data packet addressed to the group.
-
- 2) When a data packet from S addressed to G arrives at the first hop ESL
- designated router (DR), and the DR has no current forwarding entry for
- (S,G), the router looks up the RP(s) address(es) associated with G.
-
- The RP information may be configured or may be provided by a new
- IGMP-RP-report message. If no RP information exists, then the router
- assumes the group is handled as a dense group and simply sends the data
- packets out all non-incoming interfaces. The RP mapping function is only
- performed by the first ESL router to see the source's packets before
- the *,G entry is established; i.e., the mapping is not performed by each ESL
- router on a distribution tree. The RP information should be cached for
- future use.
-
- 3) The router sends an ESL-Register message to the RP. The message
- indicates the group for which the source is registering, and has the WC
- bit set. Mask information for the source may be included.
-
- The original data packet is encapsulated inside the Register
- packet.
-
- The message is sent as a unicast packet to the RP; it is not processed
- by the intermediate routers. If there are multiple RPs associated with
- the multicast group, then the source sends a Register message to each of them.
-
- Subsequent data packets sent to the same group will trigger the same
- action until an S,G entry is set up in the first hop router in response
- to a join message received from downstream. The RP information should be
- cached so that multiple lookups can be avoided for subsequent data
- packets sent to the same group.
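-
- The first-hop decision in steps 2 and 3 can be sketched in Python as
- below. The function, the lookup_rps callable and the returned action
- names are hypothetical; the sketch only shows the order of the checks
- and the caching of the group-to-RP mapping.
-
```python
def first_hop_action(group, has_sg_entry, rp_cache, lookup_rps):
    """Decide what the first-hop DR does with a data packet sent to `group`.

    Returns ("forward", None), ("dense", None) or ("register", [rp, ...]).
    """
    if has_sg_entry:
        return ("forward", None)       # normal (S,G) forwarding
    if group not in rp_cache:
        rp_cache[group] = list(lookup_rps(group))    # cache the mapping
    rps = rp_cache[group]
    if not rps:
        return ("dense", None)         # no RP known: treat as a dense group
    return ("register", rps)           # one unicast Register per RP
```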
-
-
- 4) When a router (i.e., the RP) receives a Register message, the
- router
- a) decapsulates the data packet, and forwards it according to its local
- *,G forwarding entry, and
-
- b) sets up an S,G forwarding entry with the outgoing interface list copied
- from the *,G outgoing interface list. The S,G entry is set up using the
- mask information, if provided, in the Register message. A
- timer is set for the S,G entry.
-
-
- The S,G entry causes the RP to send an ESL-Join message for the
- indicated group toward the source of the Register message. The
- ESL-Join message includes the source's address and mask information;
- note the source here is the source of the Register message, i.e., the
- source-host's directly-connected ESL router, NOT the source host
- itself. This message is triggered and processed like any other
- ESL-Join message by the intermediate routers, which either create or
- augment the S,G forwarding state in exactly the same way as was
- described in 3.2: the ESL-Join message's incoming interface is added
- to the outgoing interface list, and the incoming interface for the
- entry is set to the interface used to reach the source.
-
- Note that an RP may adopt a policy of not setting up a S,G entry (and
- therefore not sending an ESL-Join message toward the source) until it has
- received m Register messages (with encapsulated data packets) from the
- source within some interval of n seconds. This would eliminate the
- overhead of S,G state upstream of the RP when small numbers of packets
- are sent sporadically. However, data packets distributed in this manner
- may be delivered on very suboptimal paths because they travel all the way
- to the RP before being multicast.
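-
- One way an RP could implement the "m Register messages within n
- seconds" policy is a sliding window per (S,G), sketched here in Python.
- The class name is hypothetical and the draft does not prescribe any
- particular mechanism; arrival times are passed in explicitly.
-
```python
from collections import deque

class RegisterThreshold:
    """Set up (S,G) state only after m Registers arrive within n seconds."""

    def __init__(self, m, n):
        self.m, self.n = m, n
        self.seen = {}             # (source, group) -> deque of arrival times

    def on_register(self, source, group, now):
        """Record one Register; return True when (S,G) should be set up."""
        times = self.seen.setdefault((source, group), deque())
        times.append(now)
        while now - times[0] >= self.n:
            times.popleft()        # forget arrivals outside the window
        return len(times) >= self.m
```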
-
- 5) Once the ESL-Join messages have propagated upstream from the RP, data
- packets from the source will follow the S,G distribution path state
- established. The packets will travel to the receivers via the
- distribution paths established by the ESL-Join messages sent upstream
- from receivers toward the RP. Multicast packets will arrive at some
- receivers before reaching the RP if the receivers and the source are
- both "upstream" of the RP.
-
- When the receivers initiate shortest-path distribution, additional
- outgoing interfaces will be added to the S,G entry and the data
- packets will be delivered via the shortest paths to receivers.
-
- 6) Data packets will continue to travel from the source to the RP(s) in
- order to reach new receivers. Similarly, receivers continue to receive
- some data packets via the RP tree in order to pick up new senders.
- However, when source-specific tree distribution is used, most data
- packets will arrive at receivers over a shortest path
- distribution tree.
-
- 7) Data packets travel from the source via the reverse shortest path
- tree rooted in the source because routers between the source and the
- receiver have a multicast forwarding entry for (Sn,G) whose outgoing
- interface list includes all the interfaces on which the routers
- received ESL-Join messages from downstream receivers.
-
- 3.3.2 RP-Reachability Messages
-
- 1) A router starts sending periodic "RP-reachability" messages
- downstream when:
- (a) it receives an ESL-Join message with its own
- address AND WC-bit set in the join list, and
- (b) the incoming interface on its (*,G) entry is Null.
- The first condition is to make sure that it is an RP.
- The second condition is to make sure that only the "dominant" RP
- will send RP-reachability messages, so the traffic can be minimized.
-
- This obviates the need to do any kind of special configuration of RPs;
- any router can be an RP since RP behavior is triggered by the protocol
- itself. A router is responsible for initiating RP-reachability messages
- to downstream nodes if it has a *,G entry with a NULL incoming interface.
-
- 2) The router sends the periodic RP-reachability messages out all the
- outgoing interfaces in the *,G entry. The period for this message is 90
- seconds. The messages are addressed to the 224.0.0.1 class D address and the
- message contents include the RP and G and an optional list of
- source/mask information.
-
-
- 3) When a router receives an RP-reachability message for a group G it
- must compare the RP address listed in the message to the RP address
- listed in the current *,G RP-entry.
-
- If the RP listed in the message is greater than the RP listed in the *,G
- RP-entry, and if the next hop used to reach the listed RP is the same as
- the next hop used to reach the RP-entry, then the router replaces its
- current RP-entry with the RP address from the RP-reachability message.
-
- This is necessary to eliminate routing loops that can occur in some
- instances when downstream receivers select different upstream RPs and the
- RP-centered distribution trees overlap.
-
- If the RP listed in the message is less than the RP listed in the *,G
- RP-entry, OR if the RP-reachability message did not come in on the RPF
- interface to the RP listed in the message, then the message is not
- forwarded.
-
- In more detail, when a router receives an RP-reachability message it
- does the following; assume that router X receives an RP-reachability message
- of RP1 from incoming interface I.
- 1. Perform RPF check. If I is not the best next hop to RP1, drop this
- RP-reachability message.
- 2. Else, If the incoming interface of (*,G) is not NULL and not I,
- drop the RP-reachability message.
- 3. If the incoming interface of (*,G) is I,
- compare RP1 with the address in RP-entry, say RP2.
- If RP1 is larger than RP2, set RP-entry to RP1 and propagate
- the RP-reachability message downstream. Otherwise, drop the
- RP-reachability message.
- 4. If the incoming interface of (*,G) is NULL and WC-bit is set
- then this router is currently acting as an RP for G. In this case,
- compare RP1 with X. If RP1 is larger than X, set RP-entry to RP1, set
- the incoming interface to the RPF interface used to reach RP1, and
- clear the WC-bit for that router. Also, propagate RP-reachability
- message downstream.
- Otherwise, if RP1 is less than X, drop the RP-reachability message.
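-
- The four checks above can be written as a single decision function. The
- Python sketch below is illustrative: the dictionary layout of the *,G
- entry, the rpf_iface callable and the return values are assumptions.
- A message naming the RP already stored in the RP-entry is propagated
- here, following the earlier paragraph that suppresses forwarding only
- when the listed RP is smaller; addresses are compared numerically.
-
```python
import ipaddress

def on_rp_reachability(star, rp1, iface, rpf_iface, my_addr):
    """Return "drop" or "propagate" for a message about `rp1` seen on `iface`.

    star: dict with "iif" (None when this router is the root), "rp", "wc".
    """
    ip = ipaddress.ip_address
    if rpf_iface(rp1) != iface:                  # 1. RPF check
        return "drop"
    if star["iif"] is not None:
        if star["iif"] != iface:                 # 2. not the (*,G) interface
            return "drop"
        if ip(rp1) > ip(star["rp"]):             # 3. larger RP takes over
            star["rp"] = rp1
        elif ip(rp1) < ip(star["rp"]):
            return "drop"
        return "propagate"                       # equal RP: assumed refresh
    if star["wc"] and ip(rp1) > ip(my_addr):     # 4. currently acting as RP
        star["rp"], star["iif"], star["wc"] = rp1, rpf_iface(rp1), False
        return "propagate"
    return "drop"
```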
-
- 4) If there are any *,G entries, the message is forwarded with
- the same class D address out the outgoing interfaces from the
- *,G entries. If a downstream router does not have any *,G
- entries then the packet is dropped.
-
- When DRs with directly connected group members receive this message
- they reset their RP-timers on the RP-entry in *,G. This allows
- group-members' directly-connected ESL routers to detect when an RP
- becomes unreachable and trigger a join toward an alternate RP, if one
- exists.
-
- 5) The RP-reachability message may optionally contain Source/Mask
- information for (S,G) entries maintained by the RP.
- This mask information is optionally obtained via Register
- messages sent to the RP by sources' first hop routers. The
- masking information can be used by last-hop ESL routers to
- consolidate S,G entries, and consequently ESL-Join lists
- sent upstream.
-
- 3.4 Multicast Data Packet Processing.
-
- Data packets are processed in a similar manner to existing multicast
- schemes. An incoming interface check is performed and if it fails the
- packet is dropped, otherwise the packet is forwarded to all the interfaces
- listed in the outgoing interface list (whose timers have not expired).
- There are two exception actions that are introduced if packets are to be
- delivered continuously, even during the transition from a shared to
- shortest path tree. First, when a data packet matches on an S,G entry with
- a cleared SPT bit, if the packet does not match the incoming interface for
- that entry, then the packet is forwarded according to the *,G entry; i.e.,
- it is sent to the outgoing interfaces listed in *,G IF the incoming
- interface matches that of the *,G. In addition, when a data packet matches
- on an S,G entry with a cleared SPT bit, AND the incoming interface of the
- packet matches that of the S,G entry, then the packet is forwarded and the
- SPT bit is set for that entry.
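-
- The forwarding rule, including the two exceptions for a cleared SPT
- bit, can be sketched in Python as follows. The dictionary layout is an
- assumption of this example, and expiry of outgoing interface timers is
- left out for brevity.
-
```python
def forward(sg, star, in_iface):
    """Return the set of interfaces a data packet is copied to.

    sg, star: dicts with "iif" and "oifs" (sg also has "spt"), or None.
    """
    if sg is not None:
        if in_iface == sg["iif"]:
            sg["spt"] = True              # a packet arrived on the S,G path
            return set(sg["oifs"])
        if sg["spt"]:
            return set()                  # wrong interface: drop
    # No S,G entry, or S,G with a cleared SPT bit and a mismatched
    # interface: fall back to the *,G entry.
    if star is not None and in_iface == star["iif"]:
        return set(star["oifs"])
    return set()
```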
-
- Data packets never trigger prunes directly. Data packets may trigger
- actions which in turn trigger prunes. In particular, data packets from a
- new source can trigger creation of a new S,G forwarding entry. This
- causes S to be included in the prune list in a triggered ESL message toward
- the RP, just as it causes S to be included in the join list in a
- triggered ESL message toward the source.
-
-
- 3.5 Packet Types
-
- RFC 1112 specifies two types of IGMP packets for hosts and routers
- to convey multicast group membership and reachability information.
-
- An IGMP Query packet is transmitted periodically by routers to ask
- hosts to report which multicast groups they are members of. An IGMP
- Report packet is transmitted by hosts in response to received
- Queries advertising group membership.
-
- This document introduces new types of IGMP packets that are used
- by ESL routers. The following packet format is used:
-
-  0                   1                   2                   3
-  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |Version| Type  |     Code      |           Checksum            |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                         Group Address                         |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-
- %%DE Dino, please add Version/Type/TOS/Code that you proposed in comments
-
- Version
- This memo specifies version 1 of IGMP. Version 0 is specified
- in RFC-988 and is now obsolete.
-
- Type
- The following types of IGMP messages are defined:
-
- 1 = Host Membership Query
- 2 = Host Membership Report
- 3 = Router DVMRP Messages
- 4 = Router ESL Messages
-
- Code
- Codes for specific message types. Used only by DVMRP and ESL.
- ESL codes are:
-
- 0 = Query
- 1 = Register
- 2 = Join/Prune
- 3 = RP-Reachable
- 4 = Assert (dense-mode ESL only)
- 5 = Mode (dual-mode ESL only)
- 6 = Mode-Ack (dual-mode ESL only)
-
- Checksum
- The checksum is the 16-bit one's complement of the one's
- complement sum of the entire IGMP message. For computing
- the checksum, the checksum field is zeroed.
-
- Group Address
- In a Host Membership Query message, the group address field
- is zeroed when sent, ignored when received.
-
- In a Host Membership Report message, the group address field
- holds the IP host group address of the group being reported.
-
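- In a Register, Join/Prune, or Query message, the group address
- field is zeroed when sent, ignored when received.
-
- In an RP-Reachable message, the group address field holds the
- group address associated with the RP (see Section 3.5.2).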
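-
- As an illustration of the fixed header and the checksum rule, the
- following Python sketch packs an 8-byte header and fills in the
- checksum. The function and constant names are hypothetical; the Type
- and Code values are taken from the lists above, and the field widths
- (4-bit Version and Type, 8-bit Code, 16-bit Checksum, 32-bit Group
- Address) are assumed to follow the RFC 1112 header layout.
-
```python
import struct

IGMP_TYPE_ESL = 4            # "Router ESL Messages"
ESL_CODE_JOIN_PRUNE = 2

def internet_checksum(data):
    """16-bit one's complement of the one's complement sum of `data`."""
    if len(data) % 2:
        data += b"\x00"                            # pad to a 16-bit boundary
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)   # fold carries back in
    return ~total & 0xFFFF

def build_message(code, group=b"\x00" * 4, body=b"",
                  version=1, msg_type=IGMP_TYPE_ESL):
    """Pack the fixed header plus `body`, with the checksum filled in."""
    first = (version << 4) | msg_type
    # The checksum field is zeroed while the checksum is computed.
    unsummed = struct.pack("!BBH4s", first, code, 0, group) + body
    csum = internet_checksum(unsummed)
    return unsummed[:2] + struct.pack("!H", csum) + unsummed[4:]
```
-
- A receiver can verify a message by running internet_checksum over the
- whole message as received; a correct message yields zero.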
-
- 3.5.1 ESL-Register, ESL-Join, and Assert messages.
-
- The Register, Join/Prune and Assert messages have additional information
- appended to the fixed header:
-
-
-  0                   1                   2                   3
-  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |   Reserved    | Maddr Length  |  Addr Length  |  Num groups   |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                   Multicast Group Address-1                   |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |    Number of Join Sources     |    Number of Prune Sources    |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                     Join Source Address-1                     |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                               .                               |
- |                               .                               |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                     Join Source Address-n                     |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                    Prune Source Address-1                     |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                               .                               |
- |                               .                               |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                    Prune Source Address-n                     |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                               .                               |
- |                               .                               |
- |                               .                               |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                   Multicast Group Address-n                   |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |    Number of Join Sources     |    Number of Prune Sources    |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                     Join Source Address-1                     |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                               .                               |
- |                               .                               |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                     Join Source Address-n                     |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                    Prune Source Address-1                     |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                               .                               |
- |                               .                               |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                    Prune Source Address-n                     |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-
- Reserved
- Unused field, zeroed when sent, ignored when received.
-
- Addr Length
- The length in bytes of the encoded source addresses in the
- Join and Prune lists.
-
- Maddr Length
- The length in bytes of the encoded multicast addresses.
-
- Num Groups
- The number of multicast group sets contained in the message.
-
- Multicast group address
- For IP, it is a 4-byte Class D address.
-
- Number of Join Sources
- Number of join source addresses listed for a given group.
-
- Join Source Address-1 - n
- This list contains the sources that the sending router will forward
- multicast datagrams for if received on the interface this
- message is sent on. The address 0.0.0.0 indicates a join for
- all sources.
-
- For a Register message, the source address specifies the
- address(es) of the Rendezvous Point(s).
-
- Number of Prune Sources
- Number of prune source addresses listed for a given group.
-
- Prune Source Address-1 - n
- This list contains the sources that the sending router does
- not want to forward multicast datagrams for when received on the
- interface this message is sent on. The address 0.0.0.0
- indicates a prune for all sources.
-
- In Router messages, all source addresses will have the following
- format:
-
-
- <WC-bit><Mask Length><Address>
-
- <WC-bit> is a 1 bit value. If 1, packets should propagate to
- this address and a wildcard multicast entry should be built
- with interface information based on the receiving and sending
- interface for Router messages. If 0, the <Address> is a source
- address. The Address should be added to the address list
- associated with the wildcard multicast entry for the group.
-
- <Mask Length> is 7 bits. The value is the number of contiguous bits
- left justified used as a mask which describes the <Address>.
-
- <Address> is the length indicated from the "Addr Length" field
- at the beginning of the header. The <Mask Length> must be less than
- or equal to "Addr Length" * 8.
-
- A source address could be a host IP address:
-
- <0><32><192.1.1.17>
-
- A source address could be the RP's IP address:
-
- <1><32><131.108.13.111>
-
- A source address could be a subnet address:
-
- <0><28><192.1.1.16>
-
- A source address could be a general aggregate:
-
- <0><16><192.1.0.0>
-
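- The source address format and the four examples above can be exercised
- with a short Python sketch. It assumes IPv4 (Addr Length of 4) and
- assumes that the WC-bit is the most significant bit of a leading byte
- whose remaining 7 bits carry the Mask Length, followed by the address;
- the exact bit ordering on the wire is not stated here, and the function
- names are hypothetical.
-
```python
import ipaddress

def encode_source(wc_bit, mask_length, address):
    """Encode WC-bit, Mask Length and a 4-byte IPv4 address."""
    if not 0 <= mask_length <= 32:
        raise ValueError("mask length must not exceed Addr Length * 8")
    first = (int(bool(wc_bit)) << 7) | mask_length
    return bytes([first]) + ipaddress.IPv4Address(address).packed

def decode_source(data):
    """Inverse of encode_source; returns (wc_bit, mask_length, address)."""
    return data[0] >> 7, data[0] & 0x7F, str(ipaddress.IPv4Address(data[1:5]))
```
-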
- ESL messages are always sent as unicast IP addressed packets. These
- messages are sent towards the direction of the Join and Prune source
- addresses. This is achieved by doing a route lookup for each source
- address and IP addressing it to the next-hop router along the path to
- the source. Each router along the way does this until the destination
- is reached.
-
- %df - is this still true?
- Router messages may be data linked multicast when transmitted on
- subnetworks that support multicast.
-
- 3.5.2 RP-reachability message
-
- The RP-reachable packet format is as follows:
-
-  0                   1                   2                   3
-  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |Version| Type  |     Code      |           Checksum            |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                         Group Address                         |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                          RP Address                           |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                       Number of Entries                       |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |  Mask Length  |                  Address ...                  |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-                                 .
-                                 .
-                                 .
-
-
- Group Address
- Group address associated with RP.
-
- RP Address
- The Rendezvous Point IP address of the sender.
-
- Number of Entries
- The number of {Mask Length, Address} entries in the message. This
- value is 0 if no addresses are included.
-
- Mask Length
- Number of bits in the network mask for the corresponding address.
-
- Address
- 4-byte IP address. The corresponding zero bits in the mask should
- be set to zero in the address.
-
-
- Each RP will send RP-Reachable messages to all routers on its
- distribution tree for a particular group. These messages are sent
- so routers can detect that an RP is unreachable. Routers that have
- attached host members for a group will process the message. On a
- multi-homed stub subnetwork, the DR is responsible for processing
- the message. A router that processes the message is one that
- updates the RP address timer to indicate that the RP is still
- alive. Other routers that are on the RP-distribution tree
- propagate the message.
-
- The RPs will address the RP-Reachable messages to 224.0.0.1.
- Routers that have state for the group with respect to the RP
- distribution tree will propagate the message. Otherwise, the message
- is discarded.
-
- If an RP address timer expires, the DR should attempt to
- send an ESL join message toward an alternate RP provided for
- that group if one is available.
-
- 3.5.3 Other message types
-
- In a future version of this document we will specify a new
- IGMP message type that will allow hosts to advertise a list of 1 to n
- RP addresses associated with a particular group address.
-
- 3.5.4 Examples
-
- Following are examples of source address encodings in Register, Join/Prune,
- and RP-reachability messages:
-
- Register scenario:
- A host (131.108.1.1) sends a multicast datagram to 224.1.1.1. The
- first hop designated router (131.108.1.2) for the attached LAN,
- sends an ESL-Register message to the RP (131.108.10.2). The multicast
- datagram is encapsulated in the Register message.
-
- IP Header:
-   Source IP address: 131.108.1.2
-   Destination IP address: 131.108.10.2
- IGMP Header:
-   IGMP Type: ESL
-   IGMP Code: Register
-   Group Address: 0.0.0.0
- ESL Header:
-   Maddr Length: 4
-   Addr Length: 4
-   Num Groups: 1
-   Multicast Group: 224.1.1.1
-   # Join/Prune Sources: 1/0
-   WC-bit: 0
-   Mask Length: 24
-   Address: 131.108.1.0
- Encapsulated datagram:
-   Source IP address: 131.108.1.1
-   Destination IP address: 224.1.1.1
-
- Join scenario:
- The RP (131.108.10.2) sends an ESL-Join upstream towards the source
- (131.108.1.0) in response to the Register message received above.
- The RP's next hop to reach 131.108.1.0 is 131.108.20.2.
-
- IP Header:
-   Source IP address: 131.108.10.2
-   Destination IP address: 131.108.20.2
- IGMP Header:
-   IGMP Type: ESL
-   IGMP Code: Join/Prune
-   Group Address: 0.0.0.0
- ESL Header:
-   Maddr Length: 4
-   Addr Length: 4
-   Num Groups: 1
-   Multicast Group: 224.1.1.1
-   # Join/Prune Sources: 1/0
-   WC-bit: 0
-   Mask Length: 24
-   Address: 131.108.1.0
-
- RP-reachability scenario:
- Now that 131.108.10.2 knows it is an RP (because it received a
- Register message), it must send ESL RP-Reachability messages
- downstream on the RP-distribution tree for 224.1.1.1. It will
- send the following packet on all outgoing interfaces of the
- (*, 224.1.1.1) entry. Each router on the path will build a new
- IP header with its own IP address as the source IP address.
-
- IP Header:
-   Source IP address: 131.108.10.2
-   Destination IP address: 224.0.0.1
- IGMP Header:
-   IGMP Type: ESL
-   IGMP Code: RP-Reachability
-   Group Address: 224.1.1.1
- ESL Header:
-   RP Address: 131.108.10.2
-   Number of Entries: 1
-   Mask Length: 24
-   Address: 131.108.1.0
-
-
-
- 4. Robustness Features
-
- 4.1 Lost ESL messages
-
- The protocol is fairly robust to lost control messages.
- If an ESL-Register message gets lost then data packets will continue to
- be encapsulated in subsequent ESL-Register messages until the RP
- initializes an S,G entry and the associated ESL-Join messages
- propagate up to the Source.
-
- If an ESL-Join message is lost then for the remainder of
- the refresh period, packets will not be forwarded on the new path, or
- will continue to be forwarded until the refresh is sent.
-
- %%Editorial note: we are not at all fixed on these timer values...
- It is recommended that ESL messages be transmitted at an
- interval of 60 seconds. Information that is cached should be timed out
- after 3 times the transmission period if no ESL message for the entry
- has been received. When a forwarding entry has no more outgoing
- interfaces it is deleted and a prune can be sent upstream (or the
- router can wait until the next period when the ESL list will no longer
- include the Source for the deleted entry and the state will eventually
- be timed out upstream).
-
- 4.1 Multiple Rendezvous Points and RP failure scenarios
-
- If there is one RP then there is no concern about sources and
- receivers actually being able to rendezvous, but there is a
- reliability issue. If there is more than one RP then each receiver
- still joins to a single RP, but each source must register to EACH and
- EVERY RP. In other words there are multiple RP distribution trees, and
- so long as each source sends its packets to all of them, receivers
- need only join to one.
-
- When the RP fails or becomes unreachable by receivers, members who have
- already joined will continue to receive packets from sources that had
- previously sent to the group and for which the receivers had already
- switched to the SPT (assuming the SPT is not affected by the same failure
- as makes the RP unreachable). However, new members will send toward the
- unreachable RP and will NOT be successfully joined to the group unless
- their join packets reach existing SPTs of the sources before they reach the
- RP. New sources will attempt to register and send to the RP. Their packets
- will either not arrive at the RP in which case they will only be forwarded
- to receivers who are upstream of the RP with respect to the source, or
- their packets will get to the RP but will not reach downstream
- receivers. In the latter case, the SPT from the source to receivers
- will never be set up even if the paths that make up the SPT are
- available. This leads to the motivation for employing multiple RPs.
-
- Unreachable RPs are detected using the RP reachability message. When a
- *,G entry is established by a router with local members, a timer is set.
- The timer is reset each time an RP reachability message is received. If this
- timer expires, the router looks up an alternate RP for the group and sends a
- join toward the new RP. A new *,G entry is established with the incoming
- interface set to the interface used to reach the new RP. The outgoing
- interface list includes only those interfaces on which IGMP Reports for
- the group were received. (Other outgoing interfaces may no longer be
- valid since the router in question may not be on the shortest path
- between the downstream branch and the new RP. If the router is on this
- shortest path as well, it will eventually receive an explicit join from
- that downstream branch as the last hop routers take the same action).
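-
- The failover action taken when the RP timer expires can be sketched in
- Python as follows. The function name, the dictionary layout and the
- policy of taking the first alternate in the list are assumptions; the
- text above does not say how an alternate is chosen among several.
-
```python
def on_rp_timer_expired(star, alternates, rpf_iface, member_ifaces):
    """Rebuild the (*,G) entry around an alternate RP, if one is known.

    Returns the RP to send an ESL-Join toward, or None if there is none.
    """
    candidates = [rp for rp in alternates if rp != star["rp"]]
    if not candidates:
        return None
    new_rp = candidates[0]
    star["rp"] = new_rp
    star["iif"] = rpf_iface(new_rp)
    # Keep only interfaces on which IGMP Reports for the group were heard;
    # downstream routers re-join explicitly if this router is on their path.
    star["oifs"] = set(member_ifaces)
    return new_rp
```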
-
- When multiple RPs are used, each source registers and sends data
- packets towards each of the RPs, but Receivers only join toward a
- single RP. If one of the RPs fails, receivers that joined to that RP
- will stop receiving RP-reachability messages and will start sending
- joins to one of the alternative RPs. Sources do not need to take
- special action. When an RP is unreachable it will not receive the
- source's Register messages and therefore will not respond with joins
- and so the outgoing interfaces in *,G pointing toward the unreachable
- RP will time out; without any explicit action on the part of the
- source.
-
- Because each receiver's directly connected router selects an RP
- independently, it is possible for routers on the same part of the
- distribution tree to specify different RPs while both are still
- available. This can lead to looping in some topologies. To avoid looping,
- RP address information carried in ESL-Join and RP-reachability messages is
- examined to converge to a common RP (the larger numbered RP dominates).
-
-
- 4.2 Unicast routing changes
-
-
- When unicast routing changes, an RPF check is done and all affected expected
- incoming interfaces are updated. If the new incoming interface appears in
- the outgoing interface list, it is deleted from the outgoing list. The
- previous incoming interface may be added to the outgoing interface list by
- a subsequent join from downstream. Joins received on the current incoming
- interface are ignored. Joins received on new interfaces or existing
- outgoing interfaces are not ignored. Other outgoing interfaces are left as
- is until they are explicitly pruned by downstream routers or are timed out
- due to lack of appropriate join messages.
-
- The ESL-router must send an ESL-Join message out its new interface to
- inform upstream routers that it expects multicast datagrams over the
- interface. It must send an ESL-Prune message out the old interface, if
- the link is operational, to inform upstream routers that this part of
- the distribution tree is going away.
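-
- A minimal sketch of the entry update described above. The Entry class,
- the function names, and the (message, interface) tuples are hypothetical
- and chosen only for illustration; they are not part of the protocol.
-
```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    iif: str                                  # expected incoming interface
    oifs: set = field(default_factory=set)    # outgoing interface list

def on_route_change(entry: Entry, new_iif: str, old_link_up: bool = True) -> list:
    """Apply the RPF result; return the (message, interface) pairs to send."""
    old_iif = entry.iif
    entry.iif = new_iif
    entry.oifs.discard(new_iif)       # new incoming interface leaves the outgoing list
    messages = [("ESL-Join", new_iif)]
    if old_link_up:                   # prune only if the old link is operational
        messages.append(("ESL-Prune", old_iif))
    return messages

def on_join(entry: Entry, arrival_if: str) -> bool:
    """Process a join; return True if it was accepted."""
    if arrival_if == entry.iif:
        return False                  # joins on the incoming interface are ignored
    entry.oifs.add(arrival_if)        # the old incoming interface may be re-added here
    return True
```
-
- Other outgoing interfaces are deliberately left untouched; they go away
- only through explicit prunes or timeouts.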
-
- If the unicast route becomes unreachable, all multicast entries for (S,Gi)
- should be modified to have null outgoing interface lists. However, the
- entries should not be deleted immediately; keeping them causes periodic
- prunes to be sent and multicast packets to be discarded. Each entry
- should be kept alive for the remainder of its timeout lifetime. This
- helps to eliminate transient multicast forwarding loops. If the
- unicast route has a new next-hop interface, the (S,Gi) entries must be
- updated.
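-
- The "null the list but keep the entry" behavior might look like the
- sketch below. The table layout (a dict keyed by (source, group)) and
- the field names are assumptions made for this example.
-
```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    oifs: set = field(default_factory=set)   # outgoing interface list
    expires_at: float = 0.0                  # end of the timeout lifetime

def on_source_unreachable(entries: dict, source: str) -> None:
    """Null the outgoing lists of every (S,Gi) entry but keep the entries."""
    for (s, _group), entry in entries.items():
        if s == source:
            entry.oifs.clear()

def expire(entries: dict, now: float) -> None:
    """Delete entries only once their normal lifetime has run out."""
    for key in [k for k, e in entries.items() if e.expires_at <= now]:
        del entries[key]
```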
-
-
- The following diagram shows how a multicast forwarding loop can be
- avoided. Assume all LANs have members for a given group. The arrows
- represent the expected interface each router will receive multicast
- datagrams on, as well as the outgoing interfaces with respect to Source.
-
- ---------- ----------
- | |
- ^ ^
- +----+ +----+
- | R1 | >-----> | R2 |
- +----+ +----+
- ^ v |
- | | |
- | | | |
- | | +----+ |
- | +--------> | R3 | >-|
- | +----+ |
- | | |
- | |
- Source --------+
-
- If the path to Source changes, R1 and R3 may converge on the new path
- before R2, e.g., R1 uses its link to R3 as its expected incoming
- interface and R3 uses its new shortest path link to Source. In this
- state,
-
- ---------- ----------
- | |
- ^ ^
- +----+ +----+
- | R1 | >-----> | R2 |
- +----+ +----+
- | ^ |
- | | |
- | | | |
- | | +----+ |
- | +--------- | R3 | >-|
- | +----+ |
- | ^ |
- | |
- Source --------+
-
- R1 and R3 were informed about a topology change for Source and changed
- their incoming interfaces. Both R1 and R3 send joins up their new
- incoming interfaces. R1 also deleted its outgoing interface to
- R3 because this interface is now used as an incoming interface. R2,
- however, has not been informed about the topology change.
-
- If R1 received a multicast datagram on its old expected interface, it
- would silently drop it. This would happen if upstream routers from R1
- to the Source had old routing information. If upstream routers have
- converged on a new path, all datagrams will enter this part of the
- network through R3, which forwards them to its LAN and to R1, which
- expects them. R1 forwards to its LAN as well as to R2. R2, using its
- out-of-date expected incoming interface, also forwards the packets.
- Once R2 is informed of the topology change, it will change its
- expected incoming interface to R3 and will send a prune to R1 and a
- join to R3. The final state would look like:
-
- ---------- ----------
- | |
- ^ ^
- +----+ +----+
- | R1 | ------- | R2 |
- +----+ +----+
- | ^ ^
- | | |
- | | ^ |
- | | +----+ |
- | +--------< | R3 | >-|
- | +----+ |
- | ^ |
- | |
- Source --------+
-
-
- More generally, if unicast routing changes and the router in question has
- not converged, then one of two situations exists. In the first, one or
- more of the existing outgoing interfaces may no longer reach any
- receivers. In this case data packets are forwarded until they reach a
- router that has converged and finds that the incoming interface for the
- packet is wrong, in which case the packet is dropped. The cost of this
- transient condition is the continued sending of data packets down links
- that do not lead to receivers; this can occur for the duration of a
- refresh period.
-
- The second situation occurs when data packets begin to arrive over an
- incoming interface other than the one listed in the corresponding S,G
- entry. This occurs when upstream has converged and the router in question
- has not. In this case, data packets will be dropped instead of delivered
- for a time less than or equal to the convergence time + refresh period.
-
-
-
-
- 5. ESL Routers on multi-access subnetworks
-
- There are several multiaccess subnetwork configurations that require
- special consideration.
-
- 5.1 Designated Routers
-
-     +----+       +----+
-     | R1 |       | R2 |
-     +----+       +----+
-        |            |
-  -------------------------
-      |       |       |
-     H1a     H2a     H3b    (Hxy - Host x is a member of group y)
-
- When there are multiple ESL routers on a multi-access network, exactly one
- router must be responsible for the following actions:
-
- o Soliciting group membership from hosts and sending ESL-Join messages.
- o Sending Register messages on behalf of a connected
- source host when it sends a multicast packet.
- o Forwarding multicast packets onto the multi-access network.
-
- This is done with a simple Designated Router (DR) election.
- Neighboring routers send ESL Query packets to each other. The
- packets are sent to 224.0.0.2. The router with the largest IP address
- assumes the role of DR. The default transmission interval is 30 seconds.
- A router should consider the DR unreachable when it has not received a
- Query from it within 3 times the transmission interval. There is one DR
- per multi-access network, and it supports all groups. The DR sends
- periodic IGMP Host Query packets to 224.0.0.1 soliciting hosts to
- respond. The DR sends multicast packets using the data link address that
- is mapped from the IP Class D multicast address.
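-
- The election can be illustrated with the following sketch. The function
- name elect_dr and the last_heard bookkeeping are hypothetical; only the
- 30 second interval, the factor of 3, and the largest-address rule come
- from the text above.
-
```python
import ipaddress

QUERY_INTERVAL = 30               # seconds, default transmission interval
DR_TIMEOUT = 3 * QUERY_INTERVAL   # neighbor considered unreachable after this

def elect_dr(my_addr: str, last_heard: dict, now: float) -> str:
    """Return the DR among this router and the neighbors still considered live.

    last_heard maps a neighbor address to the time its last ESL Query arrived.
    """
    live = [addr for addr, t in last_heard.items() if now - t < DR_TIMEOUT]
    return max(live + [my_addr], key=ipaddress.IPv4Address)
```
-
- Comparing parsed addresses rather than strings matters here: as strings,
- "10.0.0.9" would sort above "10.0.0.10".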
-
- 5.2 Multiaccess subnetwork as a transit network
-
- The following diagram shows the case where a multi-access network is
- used as a transit network.
-
-
- | |
- | |
- +----+ +----+ |
- | R1 | | R2 | |
- +----+ +----+ |
- | | | Downstream to group members
- | v |
- ------------------------- |
- | | |
- v v |
- +----+ +----+ V
- | R3 | | R4 |
- +----+ +----+
- | |
- v v
- | |
-
-
- When a LAN is used as a transit network among routers, it is required
- that a single router forward multicast packets to downstream routers.
- This router is known as the Router-DR. A single Router-DR is chosen
- by the receipt of ESL messages unicast by each of the downstream
- routers. All routers that use the LAN as their incoming interface for
- multicast packets from a particular source will expect those packets
- from a single Router-DR. This router is the one they use for sending
- unicast packets to the source. All routers will select the same
- router. In the case of equal-cost unicast paths, the next hop with the
- largest IP address is used. The Router-DR forwards multicast packets
- using the data link address that is mapped from the IP Class D multicast
- address. The multicast data packets will be seen by all other routers
- connected to the LAN. Each router that has an entry for S,G or *,G with
- the LAN as the indicated incoming interface will forward the packet. If
- there is no such entry, or if the incoming interface is not the LAN,
- the packet will be silently dropped.
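-
- As an illustration of the two rules above (next-hop selection with the
- equal-cost tie-break, and the acceptance check for packets heard on the
- LAN), consider this sketch. The data layout and function names are
- assumptions made for the example.
-
```python
import ipaddress

def select_router_dr(next_hops: list) -> str:
    """Pick the upstream router from (address, cost) candidates on the LAN.

    The lowest unicast cost wins; equal-cost ties go to the largest address.
    """
    best = min(cost for _, cost in next_hops)
    ties = [addr for addr, cost in next_hops if cost == best]
    return max(ties, key=ipaddress.IPv4Address)

def accept_from_lan(table: dict, source: str, group: str, lan: str) -> bool:
    """True if a packet heard on `lan` matches S,G (or else *,G) with iif == lan."""
    entry = table.get((source, group))
    if entry is None:
        entry = table.get(("*", group))
    return entry is not None and entry["iif"] == lan
```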
-
-
-
- In the above diagram, both R3 and R4 have downstream group members. They
- will send their ESL-Join messages towards the RP, and therefore select
- either R1 or R2 as the router to send the message to. Assume R2 is on
- the shorter path to the RP. Later, when multicast datagrams travel from
- the RP, they will come through R2 only, avoiding duplicates on the LAN.
- The same procedure takes place when the source-based distribution tree
- is built. Whichever router on the LAN is chosen is also responsible for
- delivering multicast packets to host members on the LAN, if any are
- present.
-
-
- 5.3 Parallel routers
-
- The following diagram illustrates the behavior of routers that are in
- parallel and the interaction of DR and RP routers.
-
- S
- |
- ------------------------------------- LAN1
- | | |
- +----+ +----+ +----+ DR
- | R1 | RP | R2 | | R3 |
- +----+ +----+ DR +----+
- | | |
- ------------------------------------- LAN2
- | |
- H1a H2a
-
-
-
- Assume R1 is the RP for Ga, R3 is the DR for LAN1, R2 is the DR for LAN2,
- and LAN1 is the preferred path among routers to reach the RP.
- If the receivers of Ga join first, R2 will send an ESL-Join to R1, the
- RP, out LAN1. R2 builds a multicast entry for (*,Ga) with incoming
- interface LAN1 and outgoing interface LAN2.
- R1 receives the ESL-Join message and builds a multicast entry for
- (*,Ga) with incoming interface set to {} (since it is the RP) and
- outgoing interface set to LAN1.
-
- When S sends a multicast datagram, the DR for LAN1, R3, will encapsulate
- the data packet in a Register message and send it to R1, the RP, out LAN1.
-
- The RP, R1, decapsulates the data packet and forwards it
- onto LAN1, as indicated in the outgoing interface list of R1's *,Ga
- entry.
-
- The RP, R1, then processes the Register part of the message and sets
- up an S,Ga entry with LAN1 as the incoming interface and a null
- outgoing interface list; the outgoing interface list is copied from
- *,Ga but LAN1 is NOT included in S,Ga outgoing interface list because
- LAN1 is the incoming interface.
-
- R1 triggers a join toward S. When R1 does a lookup on S it finds that
- S is on a directly connected LAN and sends the join to the DR for that
- LAN, i.e., R3.
-
- When the DR, R3, receives the join it builds an S,Ga entry with LAN1
- as the incoming interface. Since there is no *,Ga entry, the outgoing
- interface list is set to null.
-
- Subsequent data packets from S for Ga that arrive from LAN1 will
- be silently dropped by R1 and R3 since the S,Ga entries have null
- outgoing interface lists.
-
- For this example we assume that shortest path trees are desired. In
- this case, when R2 receives the multicast datagram from S and finds
- that the longest match is on *,Ga (i.e., there is no S,Ga entry in
- R2), R2 creates an S,Ga entry, sets the incoming interface to LAN1
- (since that is the interface used to send packets to S), and copies
- the outgoing interface list from the existing *,Ga entry (in this case
- LAN2). R2 forwards this and subsequent multicast datagrams for S,Ga
- onto LAN2.
-
- When R2 creates S,Ga, it also triggers an ESL-Join message
- to the next hop to S; in this case S is directly connected, so
- the ESL-Join is sent to the DR for that LAN, R3. R3 receives the join
- but does not add LAN1 to its outgoing interface list for S,Ga because
- LAN1 is the incoming interface for S,Ga.
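-
- The way R1, R2 and R3 each derive their S,Ga entry in this example can
- be sketched as below. The dict-based table and the function name
- create_sg_entry are hypothetical.
-
```python
def create_sg_entry(table: dict, source: str, group: str, iif_to_source: str) -> dict:
    """Create an S,G entry, copying the outgoing list from *,G when it exists."""
    star = table.get(("*", group))
    oifs = set(star["oifs"]) if star else set()   # no *,G entry: null list
    oifs.discard(iif_to_source)   # the incoming interface is never an outgoing one
    entry = {"iif": iif_to_source, "oifs": oifs}
    table[(source, group)] = entry
    return entry
```
-
- With the tables from the example, R1 (whose *,Ga lists only LAN1) and R3
- (which has no *,Ga) both end up with a null outgoing list, while R2
- keeps LAN2.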
-
-
-
- 5.4 Leaf-Router Prunes
-
- LAN-connected routers must also detect when there are no more
- downstream routers. The following protocol is used: when a router
- whose incoming interface is the LAN has all of its outgoing interfaces
- go to null, the router multicasts a prune message for S,G onto the LAN.
- All other routers hear this prune. Any router that has the LAN as
- its incoming interface for the same S,G and has a non-null outgoing
- interface list sends a join message onto the LAN to override the prune.
- The join should go to the single upstream router that is the correct
- previous hop to the source or RP; however, at the same time we want
- others to hear the join so that they suppress their own joins. For this
- reason the join is data link multicast, with the IP address set to
- that of the upstream router.
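-
- The prune and override decisions can be sketched as follows. The
- function names and the join_heard flag are hypothetical; the flag
- stands for "another router's overriding join was already seen on the
- LAN".
-
```python
def should_prune_upstream(entry: dict, lan: str) -> bool:
    """A router whose iif is the LAN prunes once its outgoing list is null."""
    return entry["iif"] == lan and not entry["oifs"]

def should_override(entry, lan: str, join_heard: bool = False) -> bool:
    """Decide whether to answer a prune heard on `lan` with a join.

    entry is this router's state for the same S,G, or None if it has none.
    """
    if entry is None or entry["iif"] != lan or not entry["oifs"]:
        return False          # nothing downstream depends on this LAN
    return not join_heard     # suppress our join if someone else already sent one
```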
-
-
- 6. Interoperation with non-ESL networks/regions
-
- A network or collection of networks should be able to choose whether
- to use ESL or traditional multicast to join a distribution tree,
- depending on the density of the membership in that region.
- If the density is high then there is no need to carry ESL
- messages and state overhead within the region; it is more
- efficient to use RPM or flood membership reports since in general
- most links will be on a path from some source to some destination and the
- overhead for these traditional IP multicast mechanisms is not a
- function of the number of sources.
-
- In addition, we wish to interoperate with networks that do not have
- hosts and routers modified to generate and interpret ESL-Join
- messages.
-
- The basic problem of splicing these "IP clouds" onto ESL trees is
- identifying which border router for the IP cloud should be the entry
- point for data packets from a particular source, and therefore which
- sources individual border routers should put in their join and prune
- lists.
- This is analogous to the LAN case when there is more than one router
- serving it. The designated router is the one that takes responsibility
- for serving the members on the LAN.
-
- If the border routers (BRs) are running IBGP then they have the information
- necessary to determine which BR should include a particular host in
- its join list. Similarly if the BRs are running OSPF then the
- information can be computed. However, if the domain is running DVMRP
- or some other scheme, there may need to be some additional mechanism
- employed in the BRs. This is an open issue still to be resolved in
- order to achieve maximum interoperation with existing networks.
-
- An additional problem arises when interoperating with a non-ESL cloud.
- Namely, when a receiver decides to join a group inside of a cloud in
- which there are no other members, the BRs of that cloud must
- be notified in order to trigger sending of an ESL-Join message.
- In the case of MOSPF, new group membership is advertised to backbone
- routers but not necessarily to all BRs. In the case of DVMRP and most
- other distance vector IGPs, membership is not advertised at all.
- Therefore in both cases, some additional mechanism is needed.
-
- We can solve this problem in a manner similar to the multi-access LAN
- case.
-
- 1. Two internal (to the cloud) multicast groups are created:
- Multicast-Reporters (MR) and All-ESL-BRs.
-
- 2a. If the cloud runs MOSPF, then one (or a small number for
- reliability) of the backbone routers joins the MR group.
-
- 2b. If the cloud runs DVMRP, then ALL internal routers that have the
- potential of being DRs for a network (i.e., any router that will
- process an IGMP report) must join the MR group.
-
- 3. All BRs that speak ESL join the All-ESL-BRs group AND the MR group.
-
- 4. Members of All-ESL-BRs do a Designated BR (DBR) election among
- themselves.
-
- 5. The resulting DBR sends an IGMP-query to the MR group.
-
- 6. Members of MR respond by sending IGMP-report messages to the
- MR group. Members of MR listen to these reports and suppress
- sending reports for groups that have been reported by other routers.
-
- As a result, all ESL-BRs hear of all groups for which internal members
- exist. Based on this information, and information obtained from IBGP or
- OSPF, the BRs can determine which of them should send an ESL-Join
- message to the RP for each group for which there is a local member.
- Note that DBRs are specific to each source, group pair.
-
- We will describe two scenarios to illustrate the interoperability
- issue: one case where the source of a multicast datagram is in the
- non-ESL cloud and receivers in a group are outside of the cloud, and
- one case where the source is outside of the cloud and receivers are
- inside the cloud.
-
- ---------------
- / \
- / BR1 \ -----...-------- RP
- | | /
- | S | /
- | | /
- | | .
- \ BR2 / -------------.
- \ BR3 / .
- --------------- /
- | /
- +---------...------------
-
- S sends a multicast packet that gets to all border routers. Protocols
- such as MOSPF and DVMRP will cause the multicast packet to reach all
- border routers. If all border routers know of each other, the one with
- the shortest path to S is elected the Border DR (DBR). In case of a tie,
- the router with the largest IP address becomes the DBR. The DBR sends
- the Register message to the RP. All others discard the multicast packet.
- The DBR sends the multicast packets along the path to the RP after
- join messages for S,G propagate back to the DBR to establish S,G state.
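-
- A sketch of the Border DR election for one source. The function name
- and the distance_to_source mapping (border router address to path cost
- toward S) are assumptions for illustration; how the costs are learned
- is outside the sketch.
-
```python
import ipaddress

def elect_border_dr(distance_to_source: dict) -> str:
    """Shortest path to S wins; ties go to the largest IP address."""
    return min(
        distance_to_source,
        key=lambda br: (distance_to_source[br], -int(ipaddress.IPv4Address(br))),
    )
```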
-
- In the case where the receivers are inside the cloud:
-
- ---------------
- / \
- / BR1 \ -----...-------- RP
- | H1a | /
- | | /
- | H2a | /
- | | .
- \ BR2 / -------------.
- \ BR3 / .
- --------------- /
- | /
- +---------...------------
-
-
- Border routers need to know which one sends ESL-Join messages to the RP
- for which groups. If MOSPF is running, all border routers know of each
- other and can determine which one is closest to the RP. The closest one
- sends the ESL-Join message. Similarly, IBGP provides the BRs
- with the information to determine whether each particular BR has the
- preferred route to the RP.
-
- Similarly, once a data packet from a new source arrives at a BR, the BR
- must determine which border router is closest to that source. If the BR
- itself is the closest, it forwards the packet internally to the
- multicast group, sets up the source-specific forwarding entry, and sends
- an ESL-Join message toward the source. Otherwise, it encapsulates the
- packet and unicasts it to the correct border router, which takes the
- same action.
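-
- The decision a BR makes on the first packet from a new source might be
- sketched as follows. The action labels and the distance_to_source
- mapping are hypothetical, and the sketch does not define how ties
- between equally close border routers are broken.
-
```python
def handle_new_source(me: str, distance_to_source: dict) -> tuple:
    """Return (action, target BR) for a packet from a source not seen before."""
    closest = min(distance_to_source, key=distance_to_source.get)
    if closest == me:
        # Forward into the cloud, install the S,G entry, and join toward S.
        return ("forward_and_join", None)
    # Hand the packet to the BR that is the proper entry point.
    return ("encapsulate_to", closest)
```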
-
- In summary, borders need to learn about each other and their respective
- routes to RPs or sources using one of the following:
- o OSPF or IS-IS
- o IBGP or IIDRP
- o A configured mesh of tunnels, with unicast routing running over the
- tunnels.
-
-
-
-
- 7. Design Issues
-
- 7.1 Comparison with Core Based Tree
-
- CBT was proposed to address similar scaling problems; however, it
- differs in several ways. Some of the differences are functional and some
- are engineering tradeoffs.
-
- 7.1.1 Tree Types
-
- The first major issue is that CBT imposes a
- single shared tree for each multicast group. We justified our desire
- to avoid this scenario earlier. CBT must rely on more "cores" in
- order to obtain efficient distribution paths. This means that the
- core(s) must be selected carefully to avoid excessively high delay
- distribution paths. Even if the core is placed optimally, there is
- still the significant issue for continuous media types of
- concentrating all traffic onto a common data distribution tree.
-
- In ESL, if an application does not want shortest path tree
- distribution, then a host does not have to add all new sources to its
- ESL. This fact must also be signaled to the routers so that they also
- operate in shared-tree mode. This will cause the RP-based tree to
- continue to be used as the distribution tree. In that way an
- application can choose a group tree instead of a shortest path tree.
- In fact, first hop routers can make this decision independently, and a
- host could even choose differently for different sources. However, if
- RP-based distribution is maintained in any of these cases, then the
- choice of RPs is more critical than when RPs are used only as a
- transition path to shortest path trees.
-
- 7.1.2 Group Specific State
-
- There are also protocol engineering differences between the two. One
- of these issues is a tradeoff between requiring group specific state
- on the routers in between sources and the RP, vs. carrying an option
- in all DATA packets sent to the group. In CBT, data packets travel from
- the source to the CBT with an option attached. This allows the
- packets to be sent initially towards the core, by non-CBT routers, and
- then to be routed along the CBT once they hit a router that is on the
- core tree. In ESL we have chosen to use a Register packet and
- establish explicit S,G forwarding entries so that data packets need not
- require as much processing.
-
-
- 7.1.3 Soft state vs. explicit reliability mechanism
-
-
- CBT uses explicit hop by hop mechanisms to achieve reliable delivery
- of control messages. ESL uses periodic refreshes as its primary means
- of reliability. This approach reduces the complexity of the protocol
- and covers a wide range of protocol and network failures in a single
- simple mechanism. On the other hand, it can introduce additional
- protocol message overhead.
-
- 7.1.4 Effect on Host Service Model
-
- CBT requires that hosts be modified to participate in the CBT protocol.
- ESL proposes to make use of optional new IGMP Report messages that
- include a list of zero to n RPs; however, hosts do not otherwise have to
- participate directly in the ESL protocol.
-
- 7.1.5 Incoming interface check on all multicast data packets
-
- If multicast data packets loop, the result can be severe; unlike unicast
- packets, multicast packets fan out each time they loop. Therefore we
- assert that all multicast data packets should be subject to an incoming
- interface check comparable to the one performed by DVMRP and MOSPF. In
- order to do this check, *,G state can only be used downstream of the RP.
- As a consequence, in any particular router on the shared tree, a specific
- S,G entry must be maintained for sources that are upstream of the RP
- relative to that router.
-
-
- 7.2 Selecting and Identifying RPs
-
- An RP for a particular multicast group can be any IP-addressable
- entity in the internet. However, it is most efficient and convenient
- for the RP to be the directly-connected ESL router of the members of
- the group. If an RP has local members of the group then there is no
- wasted overhead associated with sources continually sending their data
- packets to the RP, since the packets need to be delivered there anyway
- for delivery to those members.
-
- Nevertheless, we need not be overly concerned with placement of the RPs
- when shortest path trees are used because the RP will
- not remain on the distribution path for most receivers, unless it happens
- to be centrally located. Obviously, pathological cases should be avoided,
- such as putting the RP on the other end of a very narrow link whose capacity
- is exceeded by the data rate of sources. The RP address can be configured or
- can be dynamically discovered by mapping from the multicast address, query
- of a directory service, or from information obtained via new ESL-RP-Report
- messages. The mapping of G to RP addresses should be cached.
-
- While the mapping of multicast addresses to RP addresses is an open
- issue in the long term, in the short term we will implement two
- mechanisms. The first approach is simply to configure the mapping
- manually. The second approach is to
- allow hosts (both sources and receivers) to inform routers of the
- mapping using a new ESL-RP-Report message. The latter approach is
- needed to support dynamic groups that hosts advertise and discover by
- participating in a special application, e.g., the session directory
- (sd) tool developed by V. Jacobson. Advertising hosts will advertise
- RP addresses along with the multicast address and other hosts that
- wish to send to or join the group will send an ESL-RP-Report message
- with the RP address(es) in response to IGMP Queries.
-
- The DNS is not a general solution because it is not appropriate for
- advertising dynamic information quickly as is needed for dynamic
- multicast groups. In the future if the DNS is used for multicast address
- advertisement, RP addresses can be advertised along with them.
-
-
- 7.4 Separating receiver and sender roles
-
- We chose to continue with the design philosophy of IP multicast for
- two reasons. The first is that in order to interoperate seamlessly
- with IP multicast we needed to maintain the separation between
- receivers and senders. The second is that the separation allows us
- to build a protocol that has less overhead per receiver by introducing
- more overhead per source.
-
- While some applications might like to have explicit information about
- all receivers in a group, the aggregation mechanisms proposed for very
- large groups would interfere with the utility of this information
- anyway; i.e., explicit receiver information would only tell which
- domains were receiving the packets, not which hosts within those
- domains. It seems that some other mechanism is needed if an
- application really wants to enforce access control on the multicast
- group. This is a subject for further study.
-
- In many applications the source should be a receiver as well in order
- to obtain feedback and facilitate debugging [Van]. For this reason
- we might add an optimization whereby an IGMP Register message that is
- appropriately flagged would be interpreted and processed as both an
- IGMP Register and an IGMP join message.
-
- 7.5 State overhead
-
- State overhead is of considerable concern given the large number of
- multicast groups that will exist and the large number of potential
- sources that do exist.
- The ESL protocol described here entails the following state.
-
- 1. On the RP downstream tree (the RP and routers downstream of the RP)
- there is *,G state for each group, and negative or positive cache
- information for each Si,G when SPTs are used.
-
- 2. On the SPTs there is Si,G state for the subset of Si's whose SPTs
- pass through that particular router.
-
- 3. On the upstream RP tree (between source and RP, what CBT calls
- off-tree), there is also Si,G state for each source whose shortest path
- to the RP passes through the particular router.
-
- In the periphery the number of sources with SPTs through a router is not
- so large. The number of groups may still be large but is not as large as
- in the center of the network.
-
- However, a very large number of sources' SPTs pass through "central
- routers" and a very large number of groups have distribution trees that
- pass through the central routers as well. Source specific state is
- unavoidable if you want SPTs. If you do not need SPTs or do not need all
- of the tree to be SPT, use *,G instead. We should investigate a
- situation in which periphery routers switch to their SPT interfaces but
- central routers stick with the *,G RP tree entry. We need to answer two
- questions: What kind and quality of trees do we end up with? And what is
- the implication for traffic concentration in the center of the network,
- even given the greater aggregate bandwidth found there?
-
- There remains the open issue of aggregation across groups as well. Scott
- Brim has proposed some mechanisms for dense mode operation.
- CBT does avoid group specific state on the routers that lie between
- sources and the shared tree for that group by employing and processing
- an option in all data packets sent to sparse multicast groups.
- For now, we wish to avoid interfering with data packet processing and pay
- with state. But we must do further studies to determine how many
- groups can be supported before the shared-tree mechanism or cross-group
- aggregation is mandated.
-
-
- 7.6 Aggregation of information in ESL
-
- There are several motivations for aggregating source information beyond
- the subnet level supported in the current specification; the
- most important are ESL message size and the amount of memory used for
- routing forwarding entries.
-
- One possibility is to use the highest level aggregate available for an
- address when setting up the multicast forwarding entry. This is
- optimal with respect to forwarding entry space. It is also optimal
- with respect to ESL message size. However, ESL messages
- will carry very coarse information. When the messages arrive at
- routers closer to the source(s), where more specific routes exist, there
- will be a large fanout and ESL messages will travel toward all members
- of the aggregate, which would be inefficient in many cases.
-
- If ESL is being used for inter-domain routing, and routers are
- able to map from IP address to domain identifier, then one possibility
- is to use the domain level aggregate for a source in ESL messages (AS
- numbers or RDIs). Then the ESL message will travel to the BR(s) of
- the domain and the BRs can use the internal multicast protocol's
- mechanism for propagating the join within the domain (e.g., send the
- appropriate LSA in MOSPF, or register a "local member" and do not prune
- in the case of RPF). However, this approach requires that it be both
- possible and efficient to map from IP to domain address when processing
- data packets, as well as control packets.
-
- %%Editorial Note: the following is a very gross high level description
- %%of Van's scheme. It is just a placeholder for the definitive paragraph
- %%that I will eventually extract from him.
- Another possibility is to use proxies as suggested by V. Jacobson.
- In this case, within ESL clouds, ESL messages need only refer to proxies
- for sources outside the cloud. In this scheme BRs would join an ESL
- tree externally and inject themselves as sources internally. When a data
- packet arrived, it would be forwarded into the cloud and
- routers would see a new source. They would then need to determine
- which is the entry BR for the particular source and forward the packet
- on the multicast tree associated with that BR. The router could cache
- a forwarding entry for the new source in order to avoid repeating this
- step on each data packet. To create efficient multicast distribution
- trees that do not generate duplicate packets, this scheme requires that
- internal routers be able to map from an IP address to the entry BR used
- by that IP address as source. If such a mechanism is not available,
- approximations may be employed that map packets based on the previous
- hop router. This technique is currently being developed and would be
- deployable as an addition to the current protocol without affecting the
- protocol specification per se.
-
- In the absence of aggregation or proxy techniques, when the number of
- sources reaches some threshold value (to be determined), receivers
- could compromise the quality of the distribution tree in exchange for
- accommodating large numbers of unaggregatable sources. In particular,
- receivers could continue to receive packets over the group tree
- instead of moving them off to a shortest path tree. For example,
- receivers could send a wildcard IGMP to an RP to maintain distribution
- of all sources' packets to that multicast address via the RP. While
- this would result in a suboptimal distribution tree, it would avoid
- explicit enumeration of sources. Alternatively, the receiver could
- send a wildcard with explicit sources listed in the prune portion of
- the list. This would allow the receiver to get shortest path delivery
- from a subset of the sources.
-
-
- One problem with leaving selection of shared vs. shortest path trees to
- the receivers is that the burden of excessive S,G entries will most
- likely be in the center of the network far away from receivers. In this
- case routers should be able to act unilaterally to decline requested
- establishment of new S,G entries. If a router does not process a join
- then the downstream receivers will not receive packets over the shortest
- path. Assuming a strategy is used whereby receivers do not prune the
- shared tree until packets arrive on the shortest path tree, then
- receivers will simply remain on the shared tree until more state becomes
- available on the shortest path tree.
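-
- A router declining new S,G state could be sketched like this. The
- max_sg_entries limit and the table layout are hypothetical; the text
- does not say how a router decides that it has run out of state.
-
```python
def process_sg_join(table: dict, source: str, group: str, max_sg_entries: int) -> bool:
    """Install S,G state for a join unless the router's budget is exhausted.

    Returns False when the join is declined; downstream receivers then
    simply stay on the shared tree.
    """
    key = (source, group)
    if key in table:
        return True                                  # refresh of existing state
    sg_count = sum(1 for s, _ in table if s != "*")  # *,G entries are not counted
    if sg_count >= max_sg_entries:
        return False
    table[key] = {"oifs": set()}
    return True
```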
-
-
- 7.7 Interaction with policy based routing
-
- ESL messages and data packets will travel over paths that include policy so
- long as the policy does not preclude them, to the same extent that unicast
- routing does. In addition, in the future we will construct a special ESL
- message type that embeds a Source Demand Route (SDRP route) and thereby
- causes the ESL message and the multicast forwarding state to be on an
- alternative distribution tree branch.
-
- To obtain policy sensitive distribution of multicast packets we need to
- consider the paths chosen for forwarding ESL-Join and Register messages.
-
- If the path to reach the RP or some source is indicated as providing the
- appropriate QOS and as being symmetric, then ESL routers can determine
- that if they forward joins upstream, the data packets will be allowed to
- travel downstream.
-
- This implies that BGP/IDRP should carry two QOS flags: a symmetry flag and
- a multicast-willing flag. The former, if set, indicates that each AD hop
- has local route selection policies that allow data to flow in either
- direction. The latter flag indicates that each AD hop on the path has a
- local transit policy that allows multicast packets.
- NOTE: there are two types of symmetry. The first indicates that allowing
- data to flow in both directions does not violate transit policies, so
- even if route selection is not symmetric, mcast forwarding entries that
- point along the reverse route do not violate policy. The second type
- of symmetry indicates that packets are in fact routed symmetrically; i.e.,
- if R1 forwards packets from S destined for D out over an interface to
- R2, then the route is truly symmetric if R2 forwards packets from D to S
- over its interface to R1. For ESL we only need the former information.
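-
- As an illustration only, the sketch below derives the two path-level
- flags from per-AD-hop policy, following the definitions above. The names
- ADHop, allows_both_directions, allows_multicast and path_flags are
- hypothetical; this draft does not define how BGP/IDRP would encode the
- flags.
-
```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ADHop:
    """One administrative domain (AD) on a path. Hypothetical model."""
    name: str
    allows_both_directions: bool  # transit policy permits data either way
    allows_multicast: bool        # transit policy permits multicast packets


def path_flags(hops: Sequence[ADHop]) -> Tuple[bool, bool]:
    """Return (symmetry, multicast_willing) for the whole path.

    Each flag is set only if every AD hop on the path satisfies it.
    Symmetry here is the first type in the NOTE: reverse-direction flow
    does not violate policy, whether or not routes are actually symmetric.
    """
    symmetry = all(h.allows_both_directions for h in hops)
    mcast_willing = all(h.allows_multicast for h in hops)
    return symmetry, mcast_willing
```
-
- A path is reported as symmetric or multicast willing only if every AD hop
- on it agrees, so a single unwilling AD clears the flag for the whole path.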
-
-
- If the generic route computed by hop-by-hop routing does not have the
- symmetry and mcast bits set, but there is an SDRP route that does, then
- the ESL message should be sent with an embedded SDRP route. This option
- needs to be added to ESL join messages. Its absence will indicate
- forwarding according to the router's unicast routing tables; its presence
- will indicate forwarding according to the SDRP route. This implies that
- SDRP should also carry symmetry and mcast QOS bits AND that ESL should
- carry an optional SDRP route inside of it.
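-
- The selection rule above can be sketched as follows. This is a minimal
- sketch under stated assumptions, not a message format: Route, JoinMessage
- and build_join are hypothetical names, and returning None when no route
- has both bits set is an assumption, since this draft does not say what a
- router should do in that case.
-
```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Route:
    """A candidate route with its two QOS bits. Hypothetical model."""
    hops: Tuple[str, ...]
    symmetric: bool
    mcast_willing: bool

    def usable_for_join(self) -> bool:
        # Both bits must be set for data to be allowed back downstream.
        return self.symmetric and self.mcast_willing


@dataclass(frozen=True)
class JoinMessage:
    group: str
    target: str  # the RP or a source
    # Absent (None): forward according to the unicast routing tables.
    # Present: forward according to the embedded SDRP route.
    sdrp_route: Optional[Tuple[str, ...]] = None


def build_join(group: str, target: str, hop_by_hop: Route,
               sdrp_candidates: Sequence[Route]) -> Optional[JoinMessage]:
    """Prefer the hop-by-hop route; fall back to a qualifying SDRP route."""
    if hop_by_hop.usable_for_join():
        return JoinMessage(group, target)
    for route in sdrp_candidates:
        if route.usable_for_join():
            return JoinMessage(group, target, sdrp_route=route.hops)
    return None  # assumption: behavior is unspecified when nothing qualifies
```
-
- The embedded route is optional, so a join built from a qualifying
- hop-by-hop route carries no SDRP route at all.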
-
-
-
- 7.8 Interaction with Receiver Initiated reservation setup such as RSVP
-
- Once the SP distribution tree has been established,
- RSVP reservation messages follow the reverse of the sender's path
- messages, and the sender's path messages travel according to the
- state that ESL installs. However, one wants to avoid switching
- reservation-oriented routes, so the receiver could initially receive
- all packets via the RP distribution tree and, after some delay,
- send ESL messages to establish the SP tree and then establish
- reservations over that tree. The source's path message
- would travel first via the RP path. To avoid setting up a
- reservation on the RP path, the receiver would send its IGMP
- message BEFORE it sends out its reservation message and wait for
- another path message to travel over the new SP.
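-
- The ordering described above can be pictured as a small state machine at
- the receiver. This is a sketch only: the class, state and message names
- are hypothetical, and the delay before switching is left to the caller
- because this draft does not specify it.
-
```python
from enum import Enum, auto


class State(Enum):
    ON_RP_TREE = auto()  # receiving via the RP tree, no reservation yet
    JOIN_SENT = auto()   # join for the SP tree sent, awaiting a new path message
    RESERVED = auto()    # reservation sent along the SP tree


class Receiver:
    def __init__(self) -> None:
        self.state = State.ON_RP_TREE
        self.sent = []  # messages emitted, in order

    def switch_delay_expired(self) -> None:
        """After some delay, ask for the SP tree BEFORE reserving anything."""
        if self.state is State.ON_RP_TREE:
            self.sent.append("ESL-Join(SP)")
            self.state = State.JOIN_SENT

    def path_message(self, via: str) -> None:
        """Handle a sender path message; `via` is "RP" or "SP"."""
        # Path messages that arrived over the RP path never trigger a
        # reservation, so no reservation is set up on the RP path.
        if self.state is State.JOIN_SENT and via == "SP":
            self.sent.append("RSVP-Resv")
            self.state = State.RESERVED
```
-
- In this sketch a path message that arrives via the RP path is ignored
- for reservation purposes both before and after the join is sent.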
-
- In summary, we expect that this receiver-initiated routing is well
- suited to receiver-initiated reservations, since if a reservation is
- blocked, the previous router or the receiver can select an alternative
- reverse path to the particular source(s). This is also a subject for
- future work that will affect the use of the protocol, and not the
- protocol itself.
-
-
-
- 7.9 Dense Mode
-
- We can use similar IGMP extensions to support a mode of multicasting
- that is good (more efficient than ESL-sparse) for forwarding to
- receivers that densely populate a region. Clouds might run this form
- of ESL internally as their internal multicast mechanism; or it could
- support dense-inter-domain groups.
-
- In this model routers run RPF (forward out all interfaces except the
- incoming one, if a packet arrived on the outgoing interface used to get
- to the source). Directly connected
- routers run IGMP query and report, and when they have no members
- for a group and receive packets for it, they send IGMP prune messages
- that consist of ESL messages with Prune lists only. Similarly, when a
- router gets a packet on an interface that is NOT its outgoing interface to
- the source, that router sends an ESL message with prune information
- only. ESL records prune information and propagates it upwards if
- entire downstream branches prune themselves. Periodically the prune
- information is timed out, the packets are sent again, and downstream
- routers must resend the prune messages.
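-
- A minimal sketch of this forwarding rule follows, for illustration only.
- DenseRouter, its fields and the PRUNE_LIFETIME value are hypothetical,
- and all non-incoming interfaces are assumed to lead to downstream
- routers; the dense mode specification itself is a separate document.
-
```python
from dataclasses import dataclass, field

PRUNE_LIFETIME = 180.0  # seconds; assumed value, not taken from the draft


@dataclass
class DenseRouter:
    interfaces: list                                # all interface names
    rpf_iface: dict                                 # source -> interface used to reach it
    local_groups: set = field(default_factory=set)  # groups with members (IGMP reports)
    prunes: dict = field(default_factory=dict)      # (source, group, iface) -> expiry time

    def record_prune(self, source, group, iface, now):
        """Remember a prune received on `iface`; it times out later."""
        self.prunes[(source, group, iface)] = now + PRUNE_LIFETIME

    def handle_packet(self, source, group, in_iface, now):
        """Return (out_ifaces, prune_iface); prune_iface is None if no prune is sent."""
        # RPF check: accept only packets arriving on the interface this
        # router itself uses to reach the source; otherwise prune that link.
        if in_iface != self.rpf_iface[source]:
            return [], in_iface
        # Forward out every other interface whose prune has not expired.
        out = [i for i in self.interfaces
               if i != in_iface
               and self.prunes.get((source, group, i), 0.0) <= now]
        # Everything downstream is pruned and there are no local members:
        # propagate the prune upward, toward the source.
        if not out and group not in self.local_groups:
            return [], in_iface
        return out, None
```
-
- Because prune state expires, packets are flooded again after
- PRUNE_LIFETIME unless downstream routers resend their prunes.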
-
- A draft specification of dense mode ESL is available from Dino Farinacci.
-
- Running dense mode internally with ESL sparse mode outside has all
- the same problems as running DVMRP internally: the need to run BGP or
- tunnels to identify appropriate BRs, and the need to add a mechanism for
- alerting BRs to new group members.
-
-
-
-
-
- 7.10 Open Issues
-
- The open issues associated with ESL are:
-
- 1. Aggregation of source lists via use of proxies.
-
- 2. Discovering RP addresses: new IGMP-Report-RP message.
-
- 3. Aggregating group specific state along the shared tree.
-
- 4. Dense to Sparse to Dense transition issues; see dense mode document.
-
- 5. Deciding when to switch from shared to shortest path trees.
-
-
-
-
- Acknowledgments
-
- Tony Ballardie, Scott Brim, Jon Crowcroft, Paul Francis, Ching-Gung
- (Charley) Liu, Liming Wei and Lixia Zhang provided detailed comments on
- previous drafts. The authors of CBT and membership of the IDMR WG provided
- many of the motivating ideas for this work and useful feedback on design
- details.
-